Jobs not ending under RHEL4

Status
Not open for further replies.

wilville

MIS
Apr 8, 2005
50
US
All-

I am running a compute farm managed by LSF. Most of the servers are Linux boxes running RHEL3, and the master is a Sun-Fire-V240 running SunOS 5.8. One server has been upgraded to RHEL4, and guess which one is having trouble. On the upgraded server, LSF jobs start fine and run to completion but never fully terminate. These jobs are being submitted from a desktop running RHEL3.

From my own desktop, also running RHEL3, I can ssh to the upgraded server and run a job, which again runs to completion. But when I try to leave the ssh session with exit, it hangs; I have to close the window out from under ssh to end the session. I suspect the two problems are related and that fixing one will probably fix both. It sounds like a socket not being closed, or something like that. Does anyone have any ideas where to look?

Thanks!!!
Wilville
 
I'm not sure about LSF, but the ssh symptoms sound like a process is holding your TTY open, which prevents ssh from closing the connection cleanly. When your test job has finished, try running ps -ft `tty` and/or lsof `tty` to see whether any processes apart from your shell are still attached to it.
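
If the listing is long, it may help to filter your own shell out of it. Here is a minimal sketch, not a tested fix: other_holders is a made-up helper name, and it assumes ps prints one "pid command" line per process, which the standard -o pid=,args= format should give you.

```shell
#!/bin/sh
# other_holders is a hypothetical helper, not part of LSF or ssh.
# It reads "pid command" lines on stdin and drops the line whose
# PID matches the first argument (normally your login shell).
other_holders() {
    awk -v skip="$1" '$1 != skip'
}

# Suggested use on the upgraded server, after the test job has finished:
#   ps -o pid=,args= -t "$(tty)" | other_holders "$$"
```

The ps and awk processes from the pipeline will show up in the output too, since they run on the same terminal. Anything else left over is a candidate for whatever is keeping the session from closing.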

Annihilannic.
 