Realtek 8169 gbit wrong speed, wrong duplex

Status
Not open for further replies.

POPKORN

Technical User
Jan 10, 2005
95
US
Hello all.

I have decided to post since I have spent the last 3 days crawling the internet for an answer with no success.

My problem is with a Realtek 8169 gbit.
It's in an Ultra 80.
I have tried to use ndd in the following ways:

ndd -set /dev/rge0 adv_100fdx_cap 0
ndd -set /dev/rge0 adv_10fdx_cap 0
ndd -set /dev/rge0 adv_10hdx_cap 0
ndd -set /dev/rge0 adv_100hdx_cap 0
ndd -set /dev/rge0 adv_1000fdx_cap 1

with the following result

bash-3.00# dladm show-dev
hme0 link: unknown speed: 0 Mbps duplex: unknown
rge0 link: up speed: 10 Mbps duplex: half
bash-3.00#
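
To check the negotiated speed from a script rather than by eye, the `dladm show-dev` output above can be parsed. This is only a minimal sketch, assuming the one-line-per-device layout shown in this thread. The function name `link_status` is made up for illustration, and on Solaris 10 `nawk` would likely be needed in place of `awk`, since the stock `/usr/bin/awk` may not accept `-v`.

```shell
# Print "<speed> <duplex>" for one device, reading `dladm show-dev`
# style text on stdin. link_status is a hypothetical helper name.
link_status() {
    awk -v dev="$1" '
        $1 == dev {
            # Walk the fields and pick the value that follows each label.
            for (i = 2; i < NF; i++) {
                if ($i == "speed:")  speed  = $(i + 1)
                if ($i == "duplex:") duplex = $(i + 1)
            }
            print speed, duplex
        }'
}

# Usage (on the Solaris box):
#   dladm show-dev | link_status rge0
```

Run against the output above, this would report the speed and duplex fields for rge0 only, which makes it easy to compare before and after each ndd change.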

I do not intend to use hme0 since it's limited to 100 Mbit.

I even went as far as swapping the gbit switch for a Cisco 2900XL to see if that would help, but the result was the same. It just seems that it does not autonegotiate, and every time it goes into 10 Mbit half duplex. Even after using ndd it would not even try to change: I was connected to the box via ssh and could keep executing commands without the session being interrupted, as it should be when you use ndd to change speed or link.

I do not see any errors in dmesg or in /var/adm/mes...

Here are a few snippets.

bash-3.00# cat /etc/driver_aliases|grep rge
rge "pci10ec,8169"
rge "pci10ec,8110"

dmesg

Mar 8 23:49:10 blabla.com mac: [ID 543131 kern.info] NOTICE: rge0/0 registered
Mar 8 23:49:10 blabla.com rge: [ID 801725 kern.info] NOTICE: rge0: link up 10Mbps Half_Duplex (initialized)


bash-3.00# prtdiag -v
System Configuration: Sun Microsystems sun4u Sun Ultra 80 UPA/PCI (4 X UltraSPARC-II 336MHz)
System clock frequency: 112 MHz
Memory size: 4096 Megabytes

========================= CPUs =========================

                    Run   Ecache   CPU    CPU
Brd  CPU   Module   MHz     MB    Impl.   Mask
---  ---  -------  -----  ------  ------  ----
 0     0     0      336     4.0   US-II    2.0
 0     1     1      336     4.0   US-II    2.0
 0     2     2      336     4.0   US-II    2.0
 0     3     3      336     4.0   US-II    2.0


========================= IO Cards =========================

     Bus   Freq
Brd  Type  MHz   Slot         Name                          Model
---  ----  ----  -----------  ----------------------------  --------------------
 0   PCI    33   On-Board     network-SUNW,hme
 0   PCI    33   On-Board     scsi-glm/disk (block)         Symbios,53C875
 0   PCI    33   On-Board     scsi-glm/disk (block)         Symbios,53C875
 0   PCI    33   pcib         ethernet-pci10ec,8169
 0   PCI    33   pcia slot 1  fibre-channel-pci10df,f800
 0   UPA   112   30           AFB, Double Buffered          SUNW,540-3623

No failures found in System


-----------------------------------------------------------

Any help or suggestions would be greatly appreciated.

PS. I had this card in a Linux server on another machine, so I know the card is good.


PoPK0rN
 
Forgot to mention: it's running Solaris 10 with the latest patches.

P0P K
 
And I forgot this one too:

bash-3.00# ndd -set /dev/rge0 adv_autoneg_cap 0
operation failed: Permission denied
bash-3.00#

I am root. I get the same message when I try 1000 half duplex, because it is not supported.
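
One possible reading of the "Permission denied" is that the parameter is read only in this driver, rather than anything to do with root privileges. ndd can normally list a driver's parameters with their access mode via `ndd <device> \?`. The sketch below assumes the usual `name (read only)` / `name (read and write)` layout of that listing. Whether `adv_autoneg_cap` really is read only in rge is not confirmed anywhere in this thread, and `ndd_param_mode` is a made-up helper name.

```shell
# Read `ndd <device> \?` style text on stdin and print the access mode
# ("read only" or "read and write") listed for one parameter.
# ndd_param_mode is a hypothetical helper; the listing layout is assumed.
ndd_param_mode() {
    awk -v p="$1" '
        $1 == p {
            sub(/^[^(]*\(/, "")   # drop everything up to the opening parenthesis
            sub(/\).*$/, "")      # drop the closing parenthesis and anything after
            print
        }'
}

# Usage (on the Solaris box):
#   ndd /dev/rge0 \? | ndd_param_mode adv_autoneg_cap
```

If the parameter comes back as read only, no amount of `ndd -set` from a root shell would be expected to change it.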


Thanks to anyone willing to help!

PoP K
 
Well,

I don't know why I didn't think of this before, but I checked the OBP version and I had 3.23.

I upgraded to 3.31, which I believe is the latest for the Ultra 80, and now I get this:

bash-3.00# ndd -set /dev/rge0 adv_autoneg_cap 0
operation failed: Permission denied
bash-3.00# ndd -set /dev/rge0 adv_100fdx_cap 0
bash-3.00# ndd -set /dev/rge0 adv_10fdx_cap 0
bash-3.00# ndd -set /dev/rge0 adv_10hdx_cap 0
bash-3.00# ndd -set /dev/rge0 adv_100hdx_cap 0
bash-3.00# ndd -set /dev/rge0 adv_1000fdx_cap 1
bash-3.00# dladm show-dev
hme0 link: unknown speed: 0 Mbps duplex: unknown
rge0 link: up speed: 100 Mbps duplex: full
bash-3.00#





For some reason it's not taking the full 1000, but I did see that if I unplug the cable and plug it back in, it goes to 1000 and then back to 100.


bash-3.00# Mar 9 03:21:14 solaris.stablehosting.net rge: NOTICE: rge0: link down
Mar 9 03:21:20 solaris.stablehosting.net rge: NOTICE: rge0: link up 1000Mbps Full_Duplex
Mar 9 03:21:21 solaris.stablehosting.net rge: NOTICE: rge0: link down
Mar 9 03:21:24 solaris.stablehosting.net rge: NOTICE: rge0: link up 1000Mbps Full_Duplex
Mar 9 03:21:25 solaris.stablehosting.net rge: NOTICE: rge0: link down
Mar 9 03:21:29 blabla.com rge: NOTICE: rge0: link up 1000Mbps Full_Duplex
Mar 9 03:21:30 blabla.com rge: NOTICE: rge0: link down
Mar 9 03:21:33 blabla.com rge: NOTICE: rge0: link up 1000Mbps Full_Duplex
Mar 9 03:21:34 blabla.com rge: NOTICE: rge0: link down
Mar 9 03:21:37 blabla.com rge: NOTICE: rge0: link up 1000Mbps Full_Duplex
Mar 9 03:21:38 blabla.com rge: NOTICE: rge0: link down
Mar 9 03:21:41 blabla.com rge: NOTICE: rge0: link up 100Mbps Full_Duplex
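
A log like the one above is easier to compare across cables, switches, and slots when it is boiled down to two numbers: how many times the link dropped, and the mode it finally settled on. This is a rough sketch that assumes the `rge0: link up <speed> <duplex>` and `rge0: link down` message wording shown here. `flap_summary` is a made-up name, and `nawk` may be needed instead of `awk` on Solaris 10.

```shell
# Summarise link flapping from rge NOTICE lines on stdin.
# Prints: "<number of link-down events> <last negotiated speed> <duplex>"
# flap_summary is a hypothetical helper name.
flap_summary() {
    awk -v dev="${1:-rge0}" '
        index($0, dev ": link down") { downs++ }
        index($0, dev ": link up") {
            # The two fields after "up" are the speed and the duplex mode.
            for (i = 1; i < NF; i++)
                if ($i == "up") last = $(i + 1) " " $(i + 2)
        }
        END { print downs + 0, last }'
}

# Usage (on the Solaris box):
#   dmesg | flap_summary rge0
```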




So now the question becomes: what do I need to do to ensure that the link stays at 1000 across reboots? Right now it's on the GigE switch, but it drops to 100 Mbit.


Any help appreciated.


PopKorn







 
What about the switch side: is it set to auto?

On our GigE NICs we set both sides to auto (server and switch) and that seems to work.
 
Never used this card, but...

bash-3.00# ndd -set /dev/rge0 adv_autoneg_cap 0
operation failed: Permission denied

I'm guessing you can't use ndd to set autoneg to false. Does this card come with some software or maintenance commands? If so, I suggest using them. ndd will "just" try to change the running driver; there might be a config file for the driver, /kernel/drv/rge*conf

Best Regards, Franz
--
System Manager (Solaris, HP-UX, Linux, some networking, some SAN)
 
Well,

Somehow yesterday it worked at 1000, but today after a reboot it's back to 10 half duplex.

To answer the question about the switch: the switch is set to autosense.

And to answer the other question: there is no rge0.conf in that directory.

This card was detected by the installation and I really did not have to do anything. I configured it during install.

I mean, I don't think placing a file at /kernel/drv/rge0.conf is going to help. I already tried this and it did not work. I placed these commands in it:

ndd -set /dev/rge0 adv_100fdx_cap 0
ndd -set /dev/rge0 adv_10fdx_cap 0
ndd -set /dev/rge0 adv_10hdx_cap 0
ndd -set /dev/rge0 adv_100hdx_cap 0
ndd -set /dev/rge0 adv_1000fdx_cap 1


So now I don't understand how it worked before and now does not. I have tried different switches on auto, with no luck.

What really bothers me is that at one point yesterday it did work, after I upgraded OBP from 3.23 to 3.31.

No idea if that had anything to do with it.

I am going to try again by placing that file, or even by putting the commands in a script in rc3.d.
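
For what it's worth, a driver .conf file under /kernel/drv normally holds property definitions rather than shell commands, so ndd lines pasted into one would most likely never be run. An rc script is the more usual place for them. The following is only a sketch of such a script: the file name is hypothetical, the ndd lines are the ones already posted in this thread, and nothing here guarantees that the rge driver will honour them.

```shell
#!/bin/sh
# Sketch of an rc script, e.g. /etc/rc2.d/S99rgespeed (hypothetical name).
# Re-applies the ndd settings from this thread when called with "start".

DEV=/dev/rge0          # device node used in the posts above
NDD=${NDD:-ndd}        # overridable, so the script can be dry-run with NDD=echo

apply_rge_caps() {
    # Stop advertising every mode except 1000 Mbps full duplex.
    for cap in adv_100fdx_cap adv_10fdx_cap adv_10hdx_cap adv_100hdx_cap; do
        "$NDD" -set "$DEV" "$cap" 0
    done
    "$NDD" -set "$DEV" adv_1000fdx_cap 1
}

case "${1:-}" in
start) apply_rge_caps ;;
esac
```

Running it as `NDD=echo sh S99rgespeed start` prints the five commands instead of executing them, which is a cheap way to check the script before relying on it at boot.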

If anyone has any ideas, let me know by all means. I am willing to try anything at this point. It's a fresh install, so the worst that can happen is that I have to install it again.



Thanks in advance.


PopKorn

 
Well, if ndd does not work from the root shell, it won't work from the rc2.d directory either...
Did you install the latest recommended patch cluster for that OS? If not, please do so.
Do you have a service contract with Sun? I suggest opening a call; it might be a known problem that is not public information (intinfodoc).

Best Regards, Franz
--
System Manager (Solaris, HP-UX, Linux, some networking, some SAN)
 
Yes, I did install the latest cluster. I just noticed I did something wrong: the gig card was in the wrong slot. It's a 33/3.3v card and I had it in a 64-bit 5-volt slot.

I am lucky to have another card, so I moved the fibre channel card up one slot and the GigE card down one, and everything is where it should be.

Human error on this one due to lack of sleep.

Now I have run into another problem.


After changing the card's slot and plumbing it, it's no longer rge0; it's now rge1. Nevertheless, the system is getting panics and dumps. I deleted hostname.rge0 and hostname.rge1. I removed the soft links under /dev/rge* and performed a sys-unconfig; since this is a fresh install I did not see a problem with that.

However, I still get critical alerts and panics on the PCI bus. All of this started as soon as I moved the cards.

I know Solaris is not very fond of hardware changes, so I guess a full reinstall will cure this one and hopefully everything will be in order.


I will post back my results in a few hours.


PopKorn
 
Full reinstall done and I'm still having the same problem.

Looks like it's a hardware problem.

PopKorn
 
SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
EVENT-TIME: 0x45f40ed1.0x989662e (0x165f93e621c)
PLATFORM: SUNW,Ultra-80, CSN: -, HOSTNAME: solaris
SOURCE: SunOS, REV: 5.10 Generic_118833-33
DESC: Errors have been detected that require a reboot to ensure system
integrity. See for more information.
AUTO-RESPONSE: Solaris will attempt to save and diagnose the error telemetry
IMPACT: The system will sync files, save a crash dump if needed, and reboot
REC-ACTION: Save the error summary below in case telemetry cannot be saved

ereport.io.pci.dpe ena=165f93acbd900001 detector=[ version=0 scheme="dev"
device-path="/pci@1f,2000" ] pci-status=c2a0 pci-command=146 pci-pa=0

ereport.io.pci.sserr ena=165f93acbd900001 detector=[ version=0 scheme="dev"
device-path="/pci@1f,2000" ] pci-status=c2a0 pci-command=146 pci-pa=0

ereport.io.pci.rserr ena=165f93acbd900001 detector=[ version=0 scheme="dev"
device-path="/pci@1f,2000" ] pci-status=c2a0 pci-command=146 pci-pa=0


panic[cpu0]/thread=2a10001fcc0: pcipsy-1: Fatal PCI bus error(s)


000002a10007fe70 pcipsy:pbm_error_intr+158 (300010e2300, 1298000, 300000dfce8, 300000dfce8, 0, 300023cfb80)
%l0-3: 0000000000000001 0000000000000000 00000000018d1800 00000000018d1800
%l4-7: 0000000000000001 00000000018d1800 00000300000ef838 0000000000000001
000002a10007ff50 unix:current_thread+170 (0, 0, 0, 2a10004fcc0, 30000389ab8, 8)
%l0-3: 00000000010076e4 000002a10001f021 000000000000000e 00000000000007f0
%l4-7: ffffffffffffffff 0000030001bf5ae8 000000000000000b 000002a10001f8d0
000002a10001f970 unix:disp_getwork+38 (180c000, 1843dd8, 30001a20000, 1b, 0, 1813400)
%l0-3: 0000004400001607 ffffffffffffffff 0000000000000000 0000030000389a80
%l4-7: 0000030000389a80 0000000000000000 0000000000000000 0000000000000000
000002a10001fa20 unix:idle+d4 (1813400, 4, 180c000, ffffffffffffffff, 1, 1812000)
%l0-3: 000000000183d7e8 000000000000001b 0000000000000000 ffffffffffffffff
%l4-7: 000000000183d818 0000000000000000 0000000000000006 0000000001053b98

syncing file systems... done
dumping to /dev/dsk/c0t0d0s1, offset 107741184, content: kernel
100% done: 16225 pages dumped, compression ratio 2.86, dump succeeded
rebooting...
Resetting ...

---------------------------------------------------------

SUNW-MSG-ID: PCI-8000-5X, TYPE: Fault, VER: 1, SEVERITY: Critical
EVENT-TIME: Sun Mar 11 10:18:45 EDT 2007
PLATFORM: SUNW,Ultra-80, CSN: -, HOSTNAME: solaris
SOURCE: eft, REV: 1.16
EVENT-ID: b8ac7c8f-7dc5-4849-b903-bc7d2ee15f65
DESC: A problem was detected in the PCI subsystem. Refer to for more information.
AUTO-RESPONSE: One or more device instances may be disabled
IMPACT: Loss of services provided by the device instances associated with this fault
REC-ACTION: Schedule a repair procedure to replace the affected device. Use fmdump -v -u <EVENT_ID> to identify the device or contact Sun for support.



----------------------------------------------------------

bash-3.00# fmdump -v -u b8ac7c8f-7dc5-4849-b903-bc7d2ee15f65
TIME UUID SUNW-MSG-ID
Mar 11 10:18:45.6281 b8ac7c8f-7dc5-4849-b903-bc7d2ee15f65 PCI-8000-5X
33% fault.io.pci.bus

Problem in: hc:///motherboard=0/hostbridge=0/pcibus=0
Affects: hc:///motherboard=0/hostbridge=0/pcibus=0
FRU: hc:///motherboard=0

67% fault.io.pci.device

Problem in: hc:///motherboard=0/hostbridge=0/pcibus=0/pcidev=1/pcifn=0
Affects: dev:////pci@1f,2000/ethernet@1
FRU: hc:///component=PCI 1 66
--------------------------------------------------------
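
The fmdump report above lists two suspects with a certainty percentage each. When there are several such reports to go through, the most likely suspect can be pulled out mechanically. This is a small sketch that assumes the `NN% fault.class` line layout shown above; `top_suspect` is a made-up name, and `nawk` may be needed instead of `awk` on Solaris 10.

```shell
# Read `fmdump -v -u <EVENT_ID>` style text on stdin and print the suspect
# with the highest certainty as "<percent> <fault class>".
# top_suspect is a hypothetical helper name.
top_suspect() {
    awk '
        $1 ~ /^[0-9]+%$/ {
            pct = $1 + 0                     # "67%" converts to the number 67
            if (pct > best) { best = pct; class = $2 }
        }
        END { if (class != "") print best, class }'
}

# Usage (on the Solaris box):
#   fmdump -v -u b8ac7c8f-7dc5-4849-b903-bc7d2ee15f65 | top_suspect
```

For the report in this post, that points at the PCI device (the card in the slot) rather than the bus or motherboard, which fits with how the thread ends.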



I guess there is either a conflict with the Realtek card, or Solaris doesn't like the card, or there is a hardware issue. Great.

If anyone has any insight on this, it would be much appreciated.

A fresh install should not do this, so it kind of points to hardware, but I don't even know where to start.

I already moved the card and reseated it. I hope it's not the motherboard.

PopKorn
 
Guys, thanks for your help.

This is looking more like a hardware problem. I will open another thread, because this one is no longer relevant to my current problem. Nevertheless, thanks to all who commented.


PopK0rn
 
Update.

If you own a SPARC machine and would like to use gbit ethernet, please by all means stay away from the Hawking HGA32T. This card is not compatible at all with Sun hardware, even though it is listed on the HCL on Sun's website.

This card uses the Realtek RTL8169-S32 chipset. The problem is not the chip; I can tell you that much because I purchased a NETGEAR G311, which has the same chipset, and it is currently working with no issues. I did not even have to use the Gani drivers. The default rge driver works fine.

bash-3.00# dladm show-dev
hme0 link: unknown speed: 0 Mbps duplex: unknown
rge0 link: up speed: 1000 Mbps duplex: full


This is for informational purposes only. I am only stating the fact that the Hawking card did not work on my Ultra 80 and that the NETGEAR did. I am not trying to disparage any company in any way. It is only my personal opinion.


PS. The card working right now is the NETGEAR.


P0Pk0rn
 
Great feedback! Thanks P0Pk0rn. This is how boards work: Question - Answers - Feedback/Solution!

Best Regards, Franz
--
System Manager (Solaris, HP-UX, Linux, some networking, some SAN)
 