All-
I am running a compute farm managed by LSF. Most of the servers are Linux boxes running RHEL3, and the master is a Sun-Fire-V240 running SunOS 5.8. One server has been upgraded to RHEL4, and guess which one is having trouble. On the upgraded server, LSF jobs start fine and run to completion but never fully terminate. These jobs are submitted from a desktop running RHEL3. From my own desktop, also running RHEL3, I can ssh to the upgraded server and run a job, which again runs to completion, but when I then try to leave the ssh session with exit, the session hangs and never closes; I have to close the terminal window out from under ssh to end it. I suspect the two problems are related and that fixing one will probably fix both. It sounds like a socket not being closed or something like that. Does anyone have any ideas where to look for this one?
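To make the suspicion concrete, here is a minimal sketch of one way a finished session can still appear to hang. It assumes the cause is a leftover background process that inherited the session's output descriptor, which is only a guess and has not been confirmed on this farm. The helper name `seconds_until_eof` is hypothetical, and `sleep` stands in for whatever the job leaves behind.

```python
import subprocess
import time


def seconds_until_eof(detach_output: bool, linger: int = 1) -> float:
    """Run a shell that backgrounds `sleep` and exits immediately.

    Returns how long the reader waits for end-of-file on the shell's
    stdout pipe. `linger` is how many seconds the stand-in process lives.
    """
    # Assumption for illustration: the leftover process is a plain `sleep`.
    redirect = " >/dev/null 2>&1" if detach_output else ""
    start = time.monotonic()
    shell = subprocess.Popen(
        ["sh", "-c", f"sleep {linger}{redirect} & exit 0"],
        stdout=subprocess.PIPE,
    )
    # read() returns only once every process holding the write end of the
    # pipe has closed it, not merely when the shell itself has exited.
    shell.stdout.read()
    shell.wait()
    return time.monotonic() - start


if __name__ == "__main__":
    print("output inherited: %.2fs" % seconds_until_eof(detach_output=False))
    print("output detached:  %.2fs" % seconds_until_eof(detach_output=True))
```

If this is the mechanism at work, the first call waits for the background process even though the shell has already exited, while the second returns almost at once. On the real server, checking for processes left over from the job that still hold the terminal or a socket open would be the equivalent test.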
Thanks!!!
Wilville