Portmaster lockups (T10011)

Jon Green - User Support (jon@worf.netins.net)
Tue, 07 Nov 1995 12:00:57 -0600

Note to Livingston: This is in reference to ticket number 10011.
Note to everyone else: This is going to be a long message. :/

We are still having problems with all five of our Portmasters locking up. To
recap, here are the symptoms:

Unit will stop responding to all connections. Serial ports and ethernet
ports are both unresponsive. Ethernet LED will show heartbeat, but nothing
else. Diagnostic LED will be solid.

Unit will usually come back up by itself after 15-45 minutes. After it
comes back, uptime shows that the unit did NOT reboot.

All Portmasters are connected to MultiTech racks, with a mixture of
MT2834MR and MT1432MR modems. All Portmasters except one have all 30 ports
connected. Interestingly, the one that doesn't have all ports full locks up
MUCH less often than the others.

Connecting a terminal to a serial port and doing 'set console' was
unproductive. Nothing showed up immediately before a lockup, and all we
really saw were several "restarting ethernet after buffer fill" messages.

We have three other Portmasters located at remote sites. They are
connected to MultiTech 2834ZDX standalone modems. These three units are
NOT experiencing the lockup problems that our central site is experiencing.
These three units also get much lower usage than the central site; they
have only 5-15 modems on them.

Livingston suggested the problem may be with our Synoptics Ethernet switch.
We had each Portmaster plugged into a separate port of the switch, so from
the view of the Portmaster, it sat alone on the ethernet. To test out this
theory, I moved all the Portmasters to a Synoptics model 800 10baseT hub,
which then plugged into the ethernet switch. I did this yesterday, and
today one of the units locked up again. While it was in a locked state,
I started pulling out modem cards from the rack, effectively disconnecting
everything from the serial ports on the Portmaster. After pulling out
the card containing the 15th modem, the Portmaster came back to life. I
don't know if it was coincidence or not; I'll have to do further testing to
tell.
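
For that further testing, a rough sketch like the one below could turn periodic up/down probes of a unit into a list of lockup windows, which could then be lined up against the times modem cards were pulled. This is only an illustration: the function name, the sample format, and the sample data are all made up, and how the probes are collected (ping, a telnet connect attempt, whatever) is left open.

```python
def lockup_windows(samples):
    """Return (start, end) pairs for each stretch of failed probes.

    samples: iterable of (timestamp_seconds, responded) tuples in time order.
    A window still open when the data ends is reported with end=None.
    """
    windows = []
    start = None
    for ts, responded in samples:
        if not responded and start is None:
            start = ts                    # first failed probe: lockup begins
        elif responded and start is not None:
            windows.append((start, ts))   # first good probe: lockup ends
            start = None
    if start is not None:
        windows.append((start, None))     # still locked up at end of data
    return windows


# Hypothetical probe log, one sample per minute; probes at 120-240 s failed.
samples = [(0, True), (60, True), (120, False),
           (180, False), (240, False), (300, True)]
print(lockup_windows(samples))  # → [(120, 300)]
```

If the recovery times cluster right after a card is pulled, that would argue against coincidence; if they don't, the 15-45 minute self-recovery is the more likely explanation.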

Now, the big question. Nobody else in the world seems to be having this
same problem, so I need to figure out what is unique to us that can be
causing this. I came up with a few things that would be common to all
the Portmasters:

1. Line power - They are all running on conditioned power from a Best UPS.
We have also tried moving a couple units to commercial power with no
effect.

2. Ethernet - See above

3. Modems - As I understand it, others are using MultiTech racks with Portmasters,
with no problems. Does anyone have any ideas on some kind of error condition
in the modems that would cause the Portmaster to crap out? What if a modem
suddenly locked up and stopped accepting input from the serial port? I assume
data in the Portmaster would be buffered.. how big is this buffer? What
happens when it fills up? If one of the modems crapped out and sent a
voltage spike through the serial port, what would the Portmaster do?

4. The port configurations - Livingston has looked over these and didn't see
anything wrong. All ports are configured the same, and are set up to
do a telnet to a host as soon as a modem connection is established. Here
is a sample port configuration:

                 Active Configuration    Default Configuration   (* = Host
                 --------------------    ---------------------    Can Override)
      Port Type: Login                   Login
  Login Service: Telnet                  Telnet
     Baud Rates: 57600                   57600,57600,57600
       Databits: 8                       8
       Stopbits: 1                       1
         Parity: none                    none
   Flow Control: RTS/CTS                 RTS/CTS
  Modem Control: on                      on
          Hosts: ins.netins.net          default

  Terminal Type: vt100
   Login Prompt: $hostname login:
   Autolog Name: dialup
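
To make the buffer question in item 3 concrete, here is a toy model of what I'm picturing with RTS/CTS flow control: a fixed-size output buffer that keeps receiving data while the modem has stopped accepting it. It is NOT based on any knowledge of how the Portmaster actually buffers data; the capacity, the one-byte-per-tick rates, and the function name are all assumptions made up for the sketch.

```python
from collections import deque


def first_overflow_tick(capacity, cts_by_tick, bytes_in_per_tick=1):
    """Toy bounded buffer; return the tick at which it first overflows, or None.

    capacity:     hypothetical buffer size in bytes (not a real Portmaster value)
    cts_by_tick:  sequence of booleans, True while the modem asserts CTS
    """
    buf = deque()
    for tick, cts in enumerate(cts_by_tick):
        for _ in range(bytes_in_per_tick):
            if len(buf) >= capacity:
                return tick          # no room left for incoming data
            buf.append(tick)
        if cts and buf:
            buf.popleft()            # modem drains one byte while CTS is up
    return None


# Modem accepts data for 3 ticks, then hangs with CTS dropped.
print(first_overflow_tick(4, [True] * 3 + [False] * 10))  # → 7
# Healthy modem: the buffer never fills.
print(first_overflow_tick(4, [True] * 13))                # → None
```

What I don't know is what the real unit does at the point where this model returns: drop data, block that one port, or stall everything.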

Can anyone else think of things I can look at to help track down this
problem? We're starting to get desperate here.. if these problems keep
up, we're going to have to dump the Portmasters and buy something else.
I'm convinced the Portmaster is a good product, and there must be *something*
at this site that is hitting a bug in the Portmaster. I need to find out
what that something is and eliminate it, or get the bug fixed. Whatever
the case, I need to get it done fast.

All ideas welcomed... Thanks. :)

-Jon

-------------------------------------------------------------------------
* Jon Green * GIT d- s+:+ a-- C++$ ULO++++$ P+ L++ E W+(--) *
* jon@netins.net * N++ K w(--) O- M-- V+++$(--) PS PE++ Y+ PGP+ *
* INS User Support * t+ 5 X(+) R- tv+ b+ DI++ D++ G+ e+ h r++ y+ *
------------------------------------------------------------------------