Sorry; I was slightly off - here's a description of the switchover states:
Graceful Switchover
In normal operation the health count of each CPU should be equivalent. In the case where the active CPU detects that the redundant CPU has better health, a graceful switchover is invoked. In this process, almost the entire memory image from the active CPU is copied over to the memory of the redundant CPU. The redundant CPU picks up where the active CPU left off after going through a post-switchover procedure. This post-switchover procedure includes sending out a gratuitous ARP message to tell the IP world where the active IP ELAN address is now located. This CPU becomes the active side.
The previously active side invokes a warm start after the copying operation is completed. After the warm start, it becomes the redundant side.
During a graceful switchover, there is usually no impact to calls already in progress. There is a brief period, in the neighborhood of 6-8 seconds depending upon the configuration, during which new calls are not allowed.
Graceful switchover may be invoked manually using the SCPU command in overlay 135.
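To make the health comparison concrete, here is a minimal Python sketch of the decision the active side makes. It is only an illustration, not the call server's actual logic: the function name is made up, and it assumes a higher health count means a healthier CPU.

```python
def graceful_switchover_needed(active_health: int, redundant_health: int) -> bool:
    """Return True when the redundant CPU reports strictly better health.

    Assumption: a higher health count means a healthier CPU.
    Equal counts are the normal state, so they never trigger a switchover.
    """
    return redundant_health > active_health
```

For example, graceful_switchover_needed(20, 20) is False (normal operation), while graceful_switchover_needed(18, 20) is True, which is the situation you would expect after the active side loses its ELAN.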
Ungraceful Switchover
When it is decided that the active side is inoperable (e.g. power or processor failure, watchdog timeout, exceptions), the **redundant side warm starts** and takes over control. The switchover does not occur immediately, because when the redundant side detects loss of heartbeat, it must wait long enough to be sure that the active side is not simply performing a warm start (INI). The timer used to invoke the ungraceful switchover is in the order of 56 seconds.
Heartbeat
The two CPUs exchange heartbeats to determine if the other CPU is reachable over the HSP. The heartbeat protocol also carries information regarding the health count of each CPU. If the HSP is disconnected, then the heartbeat protocol attempts to traverse the ELAN instead.
If the heartbeat cannot be communicated between the two CPUs, meaning that the connection over both the HSP and the ELAN is lost, then the redundant CPU warm starts to become active after a certain period of time.
By optimizing timeout and threshold parameters used in retries of the heartbeat mechanism, ungraceful switchover trigger time is reduced to less than 15 seconds. The optimization in the timing leads to a change in the INI policy. When the active core warm starts, the inactive core also reboots, so no swapping of the cores takes place.
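The heartbeat path selection and the ungraceful trigger described above can be sketched roughly as follows. This is a simplified Python model, not the real implementation: the function names and the single-timer structure are assumptions, and the only figures taken from the documentation are the roughly 56 second timer and the optimized value of under 15 seconds.

```python
from typing import Optional

# Figure quoted in the documentation; the optimized timing is described
# as less than 15 seconds.
UNGRACEFUL_TIMEOUT_S = 56.0


def heartbeat_path(hsp_reachable: bool, elan_reachable: bool) -> Optional[str]:
    """Pick the link the heartbeat would use: HSP first, ELAN as fallback."""
    if hsp_reachable:
        return "HSP"
    if elan_reachable:
        return "ELAN"
    return None  # no path left between the two CPUs


def should_warm_start(seconds_since_last_heartbeat: float,
                      timeout_s: float = UNGRACEFUL_TIMEOUT_S) -> bool:
    """Redundant side takes over only after heartbeats have been missing
    for the full timeout, so a plain INI on the active side is not
    mistaken for a failure (hypothetical single-timer model)."""
    return seconds_since_last_heartbeat >= timeout_s
```

With the defaults, should_warm_start(10) is False and should_warm_start(56) is True. Passing timeout_s=15 models the optimized timing, where 20 seconds of silence would already be enough to trigger the takeover.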
So by unplugging the ELAN, the health changed, and you got a "graceful" switchover - 6-8 seconds. The heartbeat was still being carried over the HSP. When you powered down the active call server, you got an ungraceful switchover, which takes up to 56 seconds (so the docs say) and also invokes an INI on the offline side. I'd guess that if it also sysloaded, then something was wrong with the offline side. Make sure both CPUs are patched (patch them in redundant mode) and then test again; make sure you can boot off both cores and run off both cores w/o error.
This is all from the System Redundancy NTPs, in the Campus Redundancy section - the description is the same as for a non-campus redundant HA configuration.
Matthew - Technical Support Engineer Sr.