Re: (PM) DMA Buffer Full (fwd)

Mark E. Mallett (mem@mv.mv.com)
Thu, 5 Feb 1998 14:22:17 -0500 (EST)

>
> Once upon a time Gary Carr shaped the electrons to say...
> >Sorry for the last post in html. Here is the error I am getting.
> >S1 DMA: Reveive Buffer Full
>
> It means exactly what it says. The receive buffer on S1 filled up.
> It could be bursty traffic, or there might be congestion on the port
> the traffic needs to flow 'out' of, backing it up so the buffer can't
> empty fast enough, etc. S1 and S3 are DMA so it shouldn't be overwhelmed
> in normal conditions.
>
> Questions -
>
> Are you running 3.7.2R? 3.5 and earlier have a known WAN port driver
> problem (let alone the security hole in anything earlier than 3.7).
>
> Which ports are you using, and what speeds? A 114 can use S1 and S3
> up to a full T1/E1 - but S2 and S4 should not be used in this case.
> They are not DMA, but rather interrupt driven, and using all four flat
> out will back up traffic.
>
> If you are only using one of the T1/E1 ports fully, then you might be
> able to use S2 and/or S4 also. A 114 was not designed to have all 4 WAN ports
> running at full speed simultaneously.

I'm not the original poster, but this may relate to an old outstanding
problem we have:

We've got two parallel T1s connecting two IRX-114s each running 3.7.2R
and each with 1MB of RAM. We started with a single T1 but were
observing lots of CRC and Frame errors (as indicated by "show S3"), to
the tune of one error every 5-10 seconds, and since it was very heavily
used we figured we'd just put in a second T1 for parallelism-- for
backup and for use in multilink PPP. The high error rates led to a
lot of TCP/IP session stalls, i.e. burstiness and poor performance.

With both T1s in use the error rate increases by about a factor of 10
or more, and the performance degrades substantially, so we tend to run
with only one of the ports enabled, thus only using one of the T1s.
In this mode, I've observed that one of the IRX-114's reports:

S3 DMA: Recieve Buffer Full (*)

about every 5-10 seconds, and every time this happens, we see an
increment of the Frame Error or CRC error counters (or both). The
traffic is heavy by about 3:1 in one direction, and it's the IRX on
the heavy receiving end that is getting the major error pattern.

When we turn up both T1s, we get a more than 10x increase in the rate at
which the CRC and Framing errors increment, but the DMA Receive Buffer
errors go down almost to nothing. In their place, we start to see
things like this:

sync_device_check (S3): Restarting Receiver
sync_device_check (S1): Restarting Receiver
net: Bad wanted 537, got 91
net: Bad wanted 552, got 60
net: Bad wanted 552, got 128
net: Bad wanted 552, got 184
net: Bad wanted 1500, got 60
net: Bad wanted 552, got 40
sync_device_check (S3): Restarting Receiver
net: Bad wanted 223, got 40

... not as frequently as the DMA errors come in with a single T1,
maybe one message every 10-20 seconds, but the CRC and Framing error
rate goes way up, counting up one every second or so. I also note that
in multilink mode, the traffic appears to be split about 2:1 between
the ports, rather than being divided equally.
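
For anyone wanting to compare notes, here is a rough sketch of how the
console messages quoted above could be tallied per port. The function
name summarize() and the regular expressions are my own, written only
against the lines shown in this message; I am also assuming, without
confirmation from the vendor, that "wanted" and "got" are expected and
received lengths.

```python
import re
from collections import Counter

# Patterns written against the console lines quoted in this message only;
# other firmware releases may word these messages differently.
RESTART_RE = re.compile(r"sync_device_check \((S\d)\): Restarting Receiver")
SHORT_RE = re.compile(r"net: Bad wanted (\d+), got (\d+)")

SAMPLE = """\
sync_device_check (S3): Restarting Receiver
sync_device_check (S1): Restarting Receiver
net: Bad wanted 537, got 91
net: Bad wanted 552, got 60
net: Bad wanted 552, got 128
net: Bad wanted 552, got 184
net: Bad wanted 1500, got 60
net: Bad wanted 552, got 40
sync_device_check (S3): Restarting Receiver
net: Bad wanted 223, got 40
"""


def summarize(lines):
    """Count receiver restarts per port and collect (wanted, got) pairs."""
    restarts = Counter()
    mismatches = []
    for line in lines:
        m = RESTART_RE.search(line)
        if m:
            restarts[m.group(1)] += 1
            continue
        m = SHORT_RE.search(line)
        if m:
            # Assumption: both numbers are lengths in the same unit.
            mismatches.append((int(m.group(1)), int(m.group(2))))
    return restarts, mismatches


if __name__ == "__main__":
    restarts, mismatches = summarize(SAMPLE.splitlines())
    for port, count in sorted(restarts.items()):
        print(f"{port}: {count} receiver restart(s)")
    truncated = sum(1 for wanted, got in mismatches if got < wanted)
    print(f"{truncated} of {len(mismatches)} 'Bad wanted' lines have got < wanted")
```

Feeding a longer capture through summarize() alongside periodic
"show S1" / "show S3" counter readings would show whether the restarts
track one port more than the other.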

The Ethernet on the IRX in question looks good, with a < 4% collision
rate.

Telco has done hard tests on each T1, and has done passive monitoring
as well, and they appear clean. The T1s are on ports S1 and S3, and
nothing is on S2 and S4. We've also swapped out the equipment and the
cabling as you might imagine.

-mm-

(*) Perhaps it should be an RFE to correct the spelling in
that debug message :-)
