We recently had a server go down. After we worked with Veritas and IBM, they determined that the crash was caused by console output getting locked when someone typed Ctrl+S at the console. Since Veritas writes output to the console and couldn't, the "had" process eventually started looping and was unable to check the heartbeat status in the cluster. Veritas then panicked the server and it dropped.
Has anyone else ever seen something like this happen?
Or does anyone have any creative ideas on how to either call BS on IBM/Veritas for this RCA or prevent this from happening again?
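On the prevention side: Ctrl+S freezing output is the terminal's XON/XOFF software flow control (the IXON input flag), so one possible angle is clearing that flag on the console device. Below is a minimal sketch in Python, assuming a POSIX system. The helper names ixon_enabled and disable_ixon are hypothetical, the demo runs against a throwaway pseudo-terminal rather than a real console, and whether this is safe or supported on the console of a Veritas cluster node is an assumption that would need confirming with the vendors.

```python
import os
import pty
import termios


def ixon_enabled(fd):
    """Return True if XON/XOFF output flow control (Ctrl+S / Ctrl+Q) is on for fd."""
    iflag = termios.tcgetattr(fd)[0]  # index 0 holds the input mode flags
    return bool(iflag & termios.IXON)


def disable_ixon(fd):
    """Clear IXON on fd so a typed Ctrl+S no longer suspends terminal output."""
    attrs = termios.tcgetattr(fd)
    attrs[0] &= ~termios.IXON
    termios.tcsetattr(fd, termios.TCSANOW, attrs)  # apply immediately


if __name__ == "__main__":
    # Demo on a pseudo-terminal (an assumption for illustration only);
    # on a real system the fd would come from opening the console device.
    master_fd, slave_fd = pty.openpty()
    try:
        print("IXON before:", ixon_enabled(slave_fd))
        disable_ixon(slave_fd)
        print("IXON after: ", ixon_enabled(slave_fd))
    finally:
        os.close(master_fd)
        os.close(slave_fd)
```

From a shell, the rough equivalent is `stty -ixon` run against the terminal in question. Either way the setting applies to the terminal device and can be changed back by anyone with access to it, so it would have to be reapplied at boot or login to be of any use.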
Thanks....