ARP/RARP Timeout during Jumpstart

kHz

I'm trying to JumpStart a Sun Blade 100 that is on the same subnet as the boot/install server (BIS; the boot and install servers are the same machine), but I constantly get a "Timeout waiting for ARP/RARP packet" message, and after a minute or two it repeats "Retrying ... Check TFTP server and network setup".

The netmask is 255.255.255.0 on both server and client. The config files are in /jumpstart, the image is in /export/install/Solaris_9_904_sparc, and my add_install_client command is:
./add_install_client -s SERVER:/export/install/Solaris_9_904_sparc -c SERVER:/jumpstart -p SERVER:/jumpstart CLIENT sun4u
The BIS is running Solaris 10 3/05, and rarpd, nfsd, statd, mountd, and lockd are all running. tftp is uncommented in /etc/inet/inetd.conf and inetd has been restarted. On the client I have run watch-net, watch-net-all, and test-net, all of which succeed. /etc/ethers and /etc/hosts have the correct entries for the client. All of the files are in /tftpboot, and I even installed a boot server in /export/install/sun4u and passed '-t /export/install/sun4u/Solaris_9/Tools/Boot' to add_install_client. The shares are /jumpstart and /export/install, and /export/install is its own filesystem.

Is there anything I am overlooking? Until now, I have never run into a JumpStart ARP timeout problem that I couldn't fix quickly.

Thanks.
 
Since the Sun Blade hasn't been configured with network information, it needs to get that information from the network. You should have a boot server that contains the following files:
/etc/ethers - For resolving Ethernet Address to Node Name
/etc/inet/hosts - For resolving Node Name to IP Address
/etc/bootparams - Provide boot, identification, configuration and installation services.
/tftpboot - And have tftp uncommented in /etc/inetd.conf
/etc/dfs/dfstab - For shared directory information
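A quick way to sanity-check that list is a small script. This is only a sketch: the function names (has_entry, check_boot_files) and the ROOT argument are made up for illustration, and it only checks that the files exist and mention the client, not that the entries are correct. On Solaris you would probably need nawk or /usr/xpg4/bin/awk in place of awk for the -v option.

```shell
#!/bin/sh
# has_entry FILE WORD: succeed if WORD appears as a whole field on a
# non-comment line of FILE. (Hypothetical helper, not a Solaris tool.)
has_entry() {
    awk -v w="$2" '
        /^[ \t]*#/ { next }
        { for (i = 1; i <= NF; i++) if ($i == w) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# check_boot_files ROOT CLIENT: report on the files a boot server needs.
# ROOT is a path prefix so you can test against a copy; use "" for the
# live system.
check_boot_files() {
    root=$1 client=$2 rc=0
    for f in /etc/ethers /etc/inet/hosts /etc/bootparams; do
        if [ ! -f "$root$f" ]; then
            echo "MISSING  $f"; rc=1
        elif has_entry "$root$f" "$client"; then
            echo "OK       $f"
        else
            echo "NO ENTRY $f ($client)"; rc=1
        fi
    done
    [ -d "$root/tftpboot" ] || { echo "MISSING  /tftpboot"; rc=1; }
    [ -f "$root/etc/dfs/dfstab" ] || { echo "MISSING  /etc/dfs/dfstab"; rc=1; }
    return $rc
}

# Example (on the boot server): check_boot_files "" CLIENT
```

A non-zero return only means something is absent; a clean run does not prove the MAC or IP in those entries is right.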

 
khz;

Was this a working JumpStart server, or is it new?

On your BIS, do you have both /export/install and /jumpstart shared in /etc/dfs/dfstab?

Usually when I get the ARP/RARP timeout, it is due to an incorrect MAC address entry in /etc/ethers.
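MAC address mismatches are easy to miss by eye because /etc/ethers entries are often written without leading zeros and in lower case (8:0:20:... rather than 08:00:20:...). A minimal sketch for comparing them in a normalized form follows; norm_mac and ethers_lookup are hypothetical names, and the example address is the one quoted elsewhere in this thread.

```shell
#!/bin/sh
# norm_mac MAC: lower-case the address and strip leading zeros from each
# octet, so 08:00:20:D9:CE:B0 and 8:0:20:d9:ce:b0 compare equal.
norm_mac() {
    echo "$1" | tr 'A-F' 'a-f' | awk -F: '{
        out = ""
        for (i = 1; i <= NF; i++) {
            o = $i
            sub(/^0+/, "", o)        # "08" -> "8"
            if (o == "") o = "0"     # "00" -> "0"
            out = out (i > 1 ? ":" : "") o
        }
        print out
    }'
}

# ethers_lookup FILE MAC: print the hostname that FILE maps MAC to,
# or return 1 if there is no matching entry.
ethers_lookup() {
    want=$(norm_mac "$2")
    while read -r mac host _rest; do
        case $mac in ''|'#'*) continue ;; esac
        if [ "$(norm_mac "$mac")" = "$want" ]; then
            echo "$host"
            return 0
        fi
    done < "$1"
    return 1
}

# Example: ethers_lookup /etc/ethers 08:00:20:D9:CE:B0
```

Take the MAC from the client's OpenBoot banner (or from the in.rarpd debug output) and feed it to ethers_lookup; no output means the server has no entry it would match.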

As a last resort I have removed /etc/bootparams, then cd'd to /tftpboot and removed all files there (rm *). Then I rerun add_install_client, and so on.

Thanks

CA
 
bfitzmai -
I have /etc/ethers, /etc/inet/hosts, /etc/bootparams, /etc/dfs/dfstab, and /tftpboot populated with the correct information, but my client just gets an ARP/RARP timeout.

cndcadams -
It is a new jumpstart server.
The BIS server has both /export/install and /jumpstart in /etc/dfs/dfstab and they are shared.
I tried running rm_install_client against the client's Ethernet address from /etc/ethers. It returned an error saying it couldn't remove the client because it couldn't find 'rm.xx.xx.xx.100' in /tftpboot. That isn't the client's address, which is 'xx.xx.xx.120', so I am not sure where it is getting that information. The xx.xx.xx.100 address is the IP of an internal Windows nameserver.

When the client does a boot net - install, how does it identify the BIS? How does the client end up with SERVER-A as its boot server when there are dozens or hundreds of other machines on the same subnet? I realize that not all of them are Solaris machines, but it still has to pick out the machine it will use as the BIS.

Thanks
 
KHZ;

That is where RARP comes in. Your server has the client's MAC address and the IP address you assigned to it. When you run boot net - install on the client, it broadcasts a RARP (Reverse Address Resolution Protocol) request that says, in effect, "does anyone out there have an IP address to go with my MAC address?" The server answers with a RARP reply: "yes, I have an IP for that MAC address, and here it is." So the client does not pick a boot server itself; whichever server answers its broadcast becomes the boot server. Do what I suggested: cd /tftpboot and run rm *, then cd /etc and rm bootparams.

When you run add_install_client it will create the entry in tftpboot and bootparams file. Then try your boot net - install again.

I am sure you have already checked, but I have to mention it: make sure there is no other system on the network with the IP you are assigning to the client.

If you have a spare hub, you could connect just the Sun Blade and your server to it and troubleshoot that way. This will rule out the network.

Keep me updated.

Thanks

CA
 
If I'm having problems like this (and I have on many occasions!) I usually snoop the network interface on the JS server with an appropriate filter expression, to make sure that it is seeing the ARP/RARP packets come in and is sending an appropriate response.

Has the server you are (re?)building been on the network recently with that other IP address? Perhaps flushing the ARP cache would help. You can also run in.rarpd manually with the -d switch in a separate terminal session to get debugging info. Actually... I would try that before the snoop.
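For anyone following along, here is a rough sketch of what those two suggestions might look like as commands. The helper only prints the command lines so they can be reviewed first; the interface name and MAC are placeholders you supply, and the function names are made up. The filter words (rarp, ether, udp, port) are snoop expression primitives as I understand the man page, and the "device unit" form for in.rarpd matches what is used later in this thread.

```shell
#!/bin/sh
# snoop_cmds IFACE MAC: print (do not run) snoop commands that watch the
# RARP exchange, all traffic from one client, and TFTP requests.
snoop_cmds() {
    iface=$1 mac=$2
    echo "snoop -d $iface rarp"
    echo "snoop -d $iface ether $mac"
    echo "snoop -d $iface udp and port 69"
}

# rarpd_debug_cmd IFACE: print the in.rarpd debug invocation, splitting
# an interface name such as eri0 into device "eri" and unit "0".
rarpd_debug_cmd() {
    echo "in.rarpd -d $(echo "$1" | sed 's/\([0-9][0-9]*\)$/ \1/')"
}

# Example:
#   snoop_cmds eri0 8:0:20:d9:ce:b0
#   rarpd_debug_cmd eri0
```

Run the printed commands as root on the JS server, in separate terminals, while the client does its boot net - install.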

Annihilannic.
 
Just another idea: a long time ago I saw JumpStart problems where some RARP requests ("who is 8:0:20:d9:ce:b0") got replies like "Unknown".
Sun suggested the following workaround: create a physical JumpStart subnet, i.e. give the JS server another physical interface and install clients through that NIC, connected either with an X-Link or a hub (some switches filter broadcasts, so you had better not use one).

The best way to debug things like this is what Annihilannic suggests: snoop the interface and see what's going on.

Best Regards, Franz
--
Solaris System Manager from Munich, Germany
I used to work for Sun Microsystems Support (EMEA) for 5 years in the domain of the OS, Backup and Storage
 
KHZ;

I also agree with Annihilannic and daFranze that snoop is a good tool to see if there is communication going on between your server and client.

Also, did you run the check script after editing the rules file?

Also take a look at /etc/bootparams file and look at the entry for your sun blade to see if it is looking at the correct files.

Thanks

CA
 
Here is the sequence of events for the client RARP request...

1. Client sends a RARP for its IP address

2. The boot server responds via in.rarpd: it looks up the client's MAC address in /etc/ethers (or the ethers NIS/NIS+ map, depending on the ethers setting in /etc/nsswitch.conf) to get the hostname, then resolves that hostname to an IP address through the hosts database

3. The client sends a tftp request for a boot image

4. The server starts in.tftpd from inetd and sends the small network boot program (inetboot)

5. The client then sends out a bootparams request

6. The server responds with the client's entry from /etc/bootparams

7. The client NFS-mounts its root filesystem from the install server

8. The client then mounts the configuration server (/jumpstart) and runs “sysidtool”.

9. It then mounts the install image and runs Suninstall to begin the install process.
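One detail that helps when checking steps 3 and 4: as far as I know, the file the client asks for over tftp is named after its IP address written as eight upper-case hex digits, and add_install_client creates a link with that name in /tftpboot. A small sketch for working out the expected name (tftp_name is a made-up helper; 192.168.1.205 is the client IP shown in the in.rarpd output later in the thread):

```shell
#!/bin/sh
# tftp_name IP: print the dotted-quad IP as eight upper-case hex digits,
# which is the file name to look for in /tftpboot.
tftp_name() {
    echo "$1" | awk -F. '{ printf "%02X%02X%02X%02X\n", $1, $2, $3, $4 }'
}

# Example: ls -l /tftpboot/$(tftp_name 192.168.1.205)*
```

If nothing in /tftpboot matches the name for the IP that in.rarpd hands out, the tftp step cannot succeed even when the RARP step does.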

KHz - Take a look at /etc/nsswitch.conf and verify that the ethers and hosts entries list "files" as the first source.
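A minimal sketch of that check, assuming the usual "database: source source ..." layout of nsswitch.conf. nss_first is a hypothetical helper name; on Solaris you may need nawk instead of awk for the -v option.

```shell
#!/bin/sh
# nss_first FILE DB: print the first source listed for database DB,
# e.g. "files" for a line such as "ethers: files nis".
nss_first() {
    awk -v db="$2:" '$1 == db { print $2; exit }' "$1"
}

# Example: warn about any boot-related database that does not list
# "files" first.
#   for db in ethers hosts bootparams; do
#       [ "$(nss_first /etc/nsswitch.conf "$db")" = "files" ] ||
#           echo "$db: first source is not files"
#   done
```

bootparams is included in the example loop because rpc.bootparamd is also driven by nsswitch.conf; drop it if you only care about the two databases named above.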
 
I would remove your entry in /etc/bootparams, also double check your /etc/ethers and /etc/hosts for duplicate addresses. I know sometimes I have to run /etc/init.d/nfs.server stop && /etc/init.d/nfs.server start (Solaris 8). Also if you are using NIS or NIS+ and use netgroups to mount up file systems you will need to add the machine there.
 
Thanks for all of the help. I removed /etc/bootparams and /tftpboot/* and then reran the add_install_client but still am getting the ARP/RARP timeout. The JS server has never been on the network with the other IP.

I am going to try snoop and see what traffic is or isn't happening between the server and client.

Well, before the snoop I tried 'in.rarpd -d eri 0' and this is the output:
in.rarpd:[1] device eri0 lladdress xx:xx:xx:xx:xx (server MAC)
in.rarpd:[1] device eri0 address 192.168.1.200 (server IP)
in.rarpd:[1] device eri0 subnet mask 255.255.255.0
in.rarpd:[3] starting rarp service on device eri0 address xx:xx:xx:xx:xx:xx (server MAC)
in.rarpd:[3] RARP_REQUEST for xx:xx:xx:xx:xx:xx (client MAC)
in.rarpd:[3] trying physical netnum 192.168.1.0 mask ffffff00
in.rarpd:[3] good lookup, maps to 192.168.1.205 (client IP)
in.rarpd:[3] immediate reply sent

Then I get the timeout and the "Check TFTP server" messages on the client.

Any ideas on the output from in.rarpd?

Thanks!
 
Out of curiosity, how many ARP/RARP timeout messages do you get? Just a couple, or do they come up continuously?

Ah, I see from your original post that you just get a couple... in my experience that's quite normal, so I think we're barking up the wrong tree. It sounds like the client is getting an IP (you can test by pinging it) but is unable to download the kernel to boot, so I'd check the TFTP side of things.

Annihilannic.
 
I have /export/install and /jumpstart as shared filesystems and in my add_install_client I use 'add_install_client -e xx:xx:xx:xx:xx:xx -s 192.168.1.200:/export/install -c 192.168.1.200:/jumpstart -p 192.168.1.200:/jumpstart CLIENT sun4u' and /etc/bootparams is this:
#cat /etc/bootparams
CLIENT root=192.168.1.200:/export/install/Solaris_9_904_sparc/Solaris_9/Tools/Boot install=192.168.1.200:/export/install boottype=:in sysid_config=192.168.1.200:/jumpstart install_config=192.168.1.200:/jumpstart rootopts=:rsize=32768
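A long bootparams entry like that is easier to review one key per line. A minimal sketch, assuming the entry sits on a single line (no backslash continuations); bootparams_show and bootparams_get are made-up helper names, and on Solaris nawk may be needed for the -v option.

```shell
#!/bin/sh
# bootparams_show FILE CLIENT: print each key=value pair of CLIENT's
# entry on its own line.
bootparams_show() {
    awk -v c="$2" '$1 == c { for (i = 2; i <= NF; i++) print $i }' "$1"
}

# bootparams_get FILE CLIENT KEY: print only the value of KEY.
bootparams_get() {
    bootparams_show "$1" "$2" | sed -n "s/^$3=//p"
}

# Example:
#   bootparams_show /etc/bootparams CLIENT
#   bootparams_get  /etc/bootparams CLIENT install
```

Each server:/path value printed this way can then be compared against what is actually shared on the server.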

Should the add_install_client for -s contain /export/install/Solaris_9_904_sparc or /export/install? What else could I check for tftp?

Thanks.
 
What if you open /etc/bootparams in vi, duplicate the entry (yy, P), and comment out the extra line as a backup? Then change the install=192.168.1.200:/export/install part to read install=192.168.1.200:/jumpstart.
 
I reran add_install_client with only -c and -p which changed the line to install=/export/install/Solaris_9_904_sparc

I ran a snoop on the interface and I am getting a read for tftp but it is followed by: ICMP Destination unreachable (UDP port 69 unreachable)

/etc/services has 69/udp for tftp and I have an entry in /etc/inet/inetd.conf for tftp that is uncommented.

So it appears this may be where I am getting hung up, but I can't figure out why.

Any suggestions?

Thanks
 
Ah!

I did a man on tftpd and at the very end of the man page it says that in.tftpd is managed by the service management facility under the identifier:

svc:/network/tftp/udp6:default

However, this does not exist on my server. Does anyone know how to create and populate an identifier on Solaris 10?

Thanks
 
I ran inetconv and that created my smf tftp service.
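For anyone who lands here with the same symptom: on Solaris 10, inetd is managed by SMF, so an uncommented tftp line in inetd.conf is not enough on its own; the entry has to be converted into an SMF service, which is what inetconv does. A hedged sketch of the check follows. The function names are made up, the script only prints a hint rather than running inetconv for you, and the grep on "network/tftp" is based on the service identifier quoted from the man page above.

```shell
#!/bin/sh
# inetd_has_tftp FILE: succeed if FILE has an uncommented tftp entry.
inetd_has_tftp() {
    grep '^[[:space:]]*tftp[[:space:]]' "$1" >/dev/null
}

# tftp_smf_hint: on an SMF system, report whether a tftp service is
# registered; elsewhere, just say that svcs is unavailable.
tftp_smf_hint() {
    if ! command -v svcs >/dev/null 2>&1; then
        echo "svcs not found: this does not look like an SMF system"
        return 0
    fi
    if svcs -a | grep 'network/tftp' >/dev/null; then
        echo "tftp SMF service is registered"
    else
        echo "no tftp SMF service; as root, try: inetconv"
    fi
}

# Example:
#   inetd_has_tftp /etc/inet/inetd.conf && tftp_smf_hint
```

After inetconv, repeating the snoop should show TFTP data going back to the client instead of the "UDP port 69 unreachable" ICMP message.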

Thanks for everyone's help!!
 