CRC, Frame, Collisions and other bad stuff on my LAN..

Status
Not open for further replies.

ITsmyfault

IS-IT--Management
Oct 24, 2002
282
US
Hi -

Equipment: BayStack 450-24T switches in 3 cascaded pairs. 1 pair is core, the other 2 pairs are access layer (workstations). We use fiber uplinks from core to access.
Hosts range from OS/2 to Cisco routers to NT machines, Sun boxes, etc. NICs by HP, Intel & 3Com.
Cabling is CAT6, new (2 yrs old).

The network is slow and shows a very strange array of errors. For example, the port for my default gateway (a Cisco router) has tons of "filtered" packets (packets that got there but were not forwarded). I'd expect "filtered" to reflect broadcasts, but other ports have 0 filtered packets. I have frame errors on the ports with my HP NICs, and massive collisions on the OS/2 box's port (it's only running IPX AFAIK). The Sun stuff is fine, no errors or other bad stuff. The HP NICs seem to be plagued with Frame, CRC or FCS errors. We're running IP, IPX, OSPF & BGP.

I have tried swapping NICs, swapping cables, swapping switches, and playing with auto-neg vs. forced speed and duplex. I've wound up back on auto on every port; forcing duplex and speed on my servers for the most part caused major problems. It seemed like a duplex issue at first, as pings are fast but file movement is just pathetic. But I can't seem to find a setting where host & switch are both happy. I sniff a port and, while the switch racks up errors, the sniffer software does not show the same thing. It's creepy.
I have checked the drivers on my NICs and have the latest firmware and software in my switches. I'm running out of things to rip and replace. Nortel tech support can't really help because I can't prove or disprove that it's switch related (fair enough, I suppose). Sometimes the thing runs great, sometimes it takes 60 sec just to open a small file. I'm losing my mind.

Anyone have a suggestion? I am not sure what to look at next, but something is wrong for sure.

Thanks for any suggestions you may have. I appreciate your time and hope I can help out with stuff on the list.

- JoeP
 
A filtered packet indicates a hub or similar device that has ports that talk among themselves: by the time the packet gets to the switch it has already reached its destination.

I would try to get full duplex to work on the OS/2 box, that usually lowers collisions.

I find auto to be the best choice in 24 out of 25 cases, sometimes I have to force 10 meg on a bad wire, or run a fiber converter at full, but for NICs auto is almost always 'best'.

I have no experience with HP NICs, but can recommend 3Coms.

I tried to remain child-like, all I achieved was childish.
 
It really sounds like the NICs with the errors are set to a fixed speed/duplex, while the switches are set to auto.

Auto rarely works unless both sides are set to auto-negotiate.

Auto to Auto: works if the wire is good
Auto to Full: horrible throughput and errors
Auto to Half: works, there will be collisions
Full to Full: works if the wire is good
Full to Half: horrible throughput and errors
Half to Half: works, there will be collisions
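
The six pairings above can be written down as a small lookup. This is only a sketch in Python: the names `OUTCOMES` and `link_outcome` are made up for illustration, and the outcomes are just the six rows listed above, not anything taken from a switch or NIC API.

```python
# Hypothetical helper: encodes the six pairings from the list above, nothing more.
# Keys are unordered sets, so ("auto", "full") and ("full", "auto") are the same entry.
OUTCOMES = {
    frozenset(["auto"]): "works if the wire is good",
    frozenset(["auto", "full"]): "horrible throughput and errors",
    frozenset(["auto", "half"]): "works, there will be collisions",
    frozenset(["full"]): "works if the wire is good",
    frozenset(["full", "half"]): "horrible throughput and errors",
    frozenset(["half"]): "works, there will be collisions",
}


def link_outcome(side_a: str, side_b: str) -> str:
    """Return the expected result for the duplex settings on the two ends of a link."""
    key = frozenset([side_a.lower(), side_b.lower()])
    if key not in OUTCOMES:
        raise ValueError(f"unknown duplex setting in {side_a!r} / {side_b!r}")
    return OUTCOMES[key]
```

For example, `link_outcome("Auto", "Full")` returns the "horrible throughput and errors" row, whichever end is the forced one.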


If the switch was set to half and the NIC to full, you would see late collisions on the switch port. With the switch trying to do full and the NIC at half, you will get integrity errors on the switch port like FCS, frame and too-short packets.
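
That rule of thumb can be sketched as a few lines of Python. The function name `guess_mismatch` and its return strings are invented for this example, and it is only a first guess: FCS and frame errors can also come from bad cabling or a bad NIC, so it does not prove a duplex mismatch.

```python
def guess_mismatch(fcs_errors: int, frame_errors: int, late_collisions: int) -> str:
    """First guess at a duplex mismatch from one switch port's counters.

    Assumption: the counters are read on the switch side of the link,
    as in the port statistics screens posted in this thread.
    """
    if late_collisions > 0:
        # Late collisions are counted by the half-duplex end of a mismatched link.
        return "switch port half, NIC probably full"
    if fcs_errors > 0 or frame_errors > 0:
        # The full-duplex end sees the damaged frames as FCS/frame errors.
        return "switch port full, NIC probably half"
    return "no mismatch signature"
```

Fed the port 19 counters posted further down (1016376 FCS errors, 1018301 frame errors, 0 late collisions), it returns "switch port full, NIC probably half", which matches that port being forced to 100 / Full with autonegotiation disabled.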
 
OK, notice port 19 here. Some wonderful individual has disabled auto at the switch and not enabled Full at the NIC. Does this look familiar?


Port Statistics
Unit: [ 1 ]   Port: [ 19 ]
Received                          Transmitted
--------------------------------  --------------------------------
Packets:             20890763     Packets:               34355899
Multicasts:          0            Multicasts:            3902754
Broadcasts:          15089        Broadcasts:            17997239
Total Octets:        72167708     Total Octets:          991064152
Lost Packets:        0            Lost Packets:          0
Packets 64 bytes:    2323726      Packets 64 bytes:      18605899
65-127 bytes         1579598      65-127 bytes           2137814
128-255 bytes        1166890      128-255 bytes          1763043
256-511 bytes        1303998      256-511 bytes          981192
512-1023 bytes       966298       512-1023 bytes         356265
1024-1518 bytes      13170370     1024-1518 bytes        10511686
Frame Errors:        1018301      Collisions:            0
Undersized Packets:  0            Single Collisions:     0
Oversized Packets:   0            Multiple Collisions:   0
Filtered Packets:    0            Excessive Collisions:  0
Flooded Packets:     0            Deferred Packets:      0
FCS Errors:          1016376      Late Collisions:       0

Port Configuration
Unit: [ 1 ]
Port    Trunk  Status       Link  LnkTrap  Autonegotiation  Speed Duplex
------  -----  -----------  ----  -------  ---------------  -----------------
15             [ Enabled ]  Down  [ On ]   [ Enabled ]      [ ]
16             [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 100Mbs / Full ]
17             [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 10Mbs / Half ]
18             [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 100Mbs / Full ]
19             [ Enabled ]  Up    [ On ]   [ Disabled ]     [ 100Mbs / Full ]
20             [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 100Mbs / Full ]
21             [ Enabled ]  Down  [ On ]   [ Enabled ]      [ ]
22             [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 100Mbs / Full ]
23             [ Enabled ]  Down  [ On ]   [ Enabled ]      [ ]
24             [ Enabled ]  Down  [ On ]   [ Enabled ]      [ ]
25             [ Enabled ]  Up    [ On ]   Enabled          1000Mbs / Full
Switch         [ Enable ]         [ On ]   [ Enable ]       [ 10Mbs / Half ]
Stack          [ Enable ]         [ On ]   [ Enable ]       [ 10Mbs / Half ]
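
If you save a screen like this to a text file, a few lines of Python can pick out the ports where autonegotiation has been turned off. This is a rough sketch, not a Nortel tool: the function name `forced_ports` and the regular expression are assumptions based only on the row layout shown in this thread, and other firmware versions may print the screen differently.

```python
import re
from typing import List

# Assumed row layout: port number, [ status ], Up/Down, [ link trap ], [ autoneg ], ...
# The brackets around the autonegotiation field are optional (see port 25 above).
ROW = re.compile(
    r"^\s*(\d+)\s+\[\s*\w+\s*\]\s+(?:Up|Down)\s+\[\s*\w+\s*\]\s+\[?\s*(Enabled|Disabled)\b"
)


def forced_ports(config_text: str) -> List[int]:
    """Return the port numbers whose Autonegotiation column reads Disabled."""
    ports = []
    for line in config_text.splitlines():
        match = ROW.match(line)
        if match and match.group(2) == "Disabled":
            ports.append(int(match.group(1)))
    return ports
```

Run against the rows for unit 1, it should flag only port 19. The "Switch" and "Stack" rows do not start with a port number, so they are skipped.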
 
Here is what a perfectly fine setup looks like with a server doing 10 / Half. All you can do is persuade the 'owner' of the server that 10 / Half is not up to par.


Port Statistics
Unit: [ 6 ]   Port: [ 10 ]
Received                          Transmitted
--------------------------------  --------------------------------
Packets:             33295246     Packets:               42391024
Multicasts:          0            Multicasts:            4587079
Broadcasts:          21108        Broadcasts:            5805069
Total Octets:        2220522083   Total Octets:          744392813
Lost Packets:        0            Lost Packets:          0
Packets 64 bytes:    4033469      Packets 64 bytes:      20150241
65-127 bytes         9233993      65-127 bytes           6654767
128-255 bytes        4986309      128-255 bytes          14612932
256-511 bytes        285918       256-511 bytes          723066
512-1023 bytes       736119       512-1023 bytes         119398
1024-1518 bytes      14019438     1024-1518 bytes        130620
Frame Errors:        0            Collisions:            5413516
Undersized Packets:  0            Single Collisions:     1144507
Oversized Packets:   0            Multiple Collisions:   931614
Filtered Packets:    0            Excessive Collisions:  1
Flooded Packets:     132          Deferred Packets:      105182
FCS Errors:          0            Late Collisions:       0

Port Configuration
Unit: [ 6 ]
Port    Trunk  Status       Link  LnkTrap  Autonegotiation  Speed Duplex
------  -----  -----------  ----  -------  ---------------  -----------------
1              [ Enabled ]  Down  [ On ]   [ Enabled ]      [ ]
2              [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 100Mbs / Full ]
3              [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 100Mbs / Full ]
4              [ Enabled ]  Down  [ On ]   [ Enabled ]      [ ]
5              [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 100Mbs / Full ]
6              [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 100Mbs / Full ]
7              [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 100Mbs / Full ]
8              [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 100Mbs / Full ]
9              [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 100Mbs / Full ]
10             [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 10Mbs / Half ]
11             [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 100Mbs / Full ]
12             [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 10Mbs / Half ]
13             [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 10Mbs / Half ]
14             [ Enabled ]  Up    [ On ]   [ Enabled ]      [ 100Mbs / Full ]
 
1. Thanks! (Maybe you should C+P that into a Bay Networks switch FAQ. ;)
2. Some of that does look very familiar. My OS/2 box is 10Mbps and has the most collisions, which does make sense.
The HP NICs are in our Novell servers, and we have tried setting the servers to forced 100/Full with the switch set the same way, but that did not work for some reason. I don't get that many frame errors, but I get enough to just kill some apps. At times it's been bad enough to crash database apps and hurt our backup job. Here's the main server's port (normally it would use multiple NICs, but until I fix this we're on 1):
Port Statistics
Unit: [ 1 ]   Port: [ 1 ]
Received                          Transmitted
--------------------------------  --------------------------------
Packets:             46274009     Packets:               87539430
Multicasts:          365          Multicasts:            1402973
Broadcasts:          86063        Broadcasts:            28517494
Total Octets:        2053702311   Total Octets:          507871730
Lost Packets:        0            Lost Packets:          0
Packets 64 bytes:    13422900     Packets 64 bytes:      11531428
65-127 bytes         14459626     65-127 bytes           33948647
128-255 bytes        11100409     128-255 bytes          7709678
256-511 bytes        1204139      256-511 bytes          12780968
512-1023 bytes       2064172      512-1023 bytes         370768
1024-1518 bytes      4022642      1024-1518 bytes        21197941
Frame Errors:        103          Collisions:            0
Undersized Packets:  0            Single Collisions:     0
Oversized Packets:   0            Multiple Collisions:   0
Filtered Packets:    0            Excessive Collisions:  0
Flooded Packets:     34           Deferred Packets:      0
FCS Errors:          0            Late Collisions:       0

I know 103 doesn't seem like much, but it's making a mess of things. IIRC a frame error will reset the session, which would perhaps account for the slowness? The switch is set for auto and is showing 100/Full. The server is configured for auto as well (the HP drivers for Novell assume auto unless you specify otherwise). I think I'm going to try another NIC manufacturer and see what happens. There may just be some odd issue with these NICs and this switch? FWIW Nortel tech support recommended defaulting the switch and reconfiguring. The engineer I spoke with said that sometimes that cures weird issues. Just thought I'd toss that out as well. I will probably try that too (my config is simple). Will post my results either way.
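
To put the counters on a common scale, they can be turned into "received packets per error". A minimal Python sketch follows; the helper name `packets_per_error` is made up for this example, and the numbers are copied from the port dumps in this thread.

```python
def packets_per_error(received_packets: int, errors: int) -> float:
    """Average number of received packets for each error on a port."""
    if errors == 0:
        return float("inf")  # no errors counted at all
    return received_packets / errors


# Unit 1, port 19 (autonegotiation disabled, forced 100 / Full): FCS errors
port_19 = packets_per_error(20890763, 1016376)

# Unit 1, port 1 (Novell server on auto): frame errors
server_port = packets_per_error(46274009, 103)

print(f"port 19: about 1 error per {port_19:,.0f} received packets")
print(f"server:  about 1 error per {server_port:,.0f} received packets")
```

On these figures port 19 takes an error roughly every 21 received packets, while the server port is closer to one in 449,000. That gap is why the two ports should not be assumed to share one cause.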
Thanks again!
- JoeP
 
Just a thought... have you checked the MTU size on your servers and the Cisco box?

Rick Harris
SC Dept of Public Safety-DMV
Network Operations
 
One side set to half and the other set to full kills throughput, but pings still work.
 
Some status to report...
I had a case open with Veritas because our backup software went on the fritz after a service pack. Long story short, their tech looked at IRQ settings and thought we had an issue with IRQ 15 as well as IRQ sharing. IRQ 15 was traditionally a "don't use" IRQ with NetWare up through NW 4.11, but from 5.0 on it has not been. We did have a sharing issue and a priority issue on our main servers, though. A NIC was sharing an IRQ with the onboard SCSI controller on 2 of the 3 cluster servers (due to the slot it was in), and the Fibre Channel had a lower interrupt.. oops. Never let the server set stuff to "auto"!! Just what I needed, more bite marks. ;)

Anyhow, I fixed that (hardcoded Fibre Channel on IRQ 9, SCSI on 11, NICs on IRQs 3 & 4, and no IRQ sharing) and now my errors from the server NICs are down to 1 in 500,000 packets (about 10x better). CPU utilization has dropped as well, especially during the backup job. I had only 14 errors yesterday, all classified as "alignment errors" by the Nortel Device Manager software (under the "ethernet errors" tab).

Not 100% perfect, but we're getting there.. thanks! -JoeP
 