ES45 - HSG80 - OpenVMS 7.3-2 - system crashed 1

Status
Not open for further replies.

CaptNeo

Technical User
Jun 4, 2002
49
0
0
US
We have an ES45 system that has its OS on one of the drives attached to an HSG80 controller. The following morning, we were stunned to learn that the system had crashed and the LED display said AC power loss. I tried replacing the power supply, but the system wouldn't power up and would just say 'PS0 failed' after I pushed the power-on button. I managed to generate the error log from another system by mounting the ES45's OS drive, and below is the only entry I saw that had a device error. The last entry of this error was around 3PM, but according to the operator log file the system was still up around 7PM. The FMU on the HSG80 doesn't show any related hardware fault: the system crashed Nov. 3 and the last FMU entry was back in Sep.
Any help?

Description: VMS Device Error Event at Wed 3 Nov 2004 15:48:09 GMT-07:00 from LAXE45
File: ./es45_errlog.sys
================================================================================

COMMON EVENT HEADER (CEH) V2.0
Event_Leader xFFFF FFFE
Header_Length 284
Event_Length 720
Header_Rev_Major 2
Header_Rev_Minor 1
OS_Type 2 -- OpenVMS AXP
Hardware_Arch 4 -- Alpha
CEH_Vendor_ID 3,564 -- Hewlett-Packard Company
Hdwr_Sys_Type 38 -- Titan Corelogic
Logging_CPU 0 -- CPU Logging this Event
CPUs_In_Active_Set 1
Major_Class 1
Minor_Class 1
Entry_Type 1,001 -- VMS Device Error Event
DSR_Msg_Num 2,017 -- AlphaServer ES45
.... Model 2/2B
.... CPU Slots: 1 (1250 Mhz)
.... PCI Slots: 10
.... MMB Slots: 8 (DIMMs)
Chip_Type 12 -- EV68CB - 21264C
CEH_Device 54
CEH_Device_ID_0 x0000 0000
CEH_Device_ID_1 x0000 0000
CEH_Device_ID_2 x0000 0000
Unique_ID_Count 2,911
Unique_ID_Prefix 1
Exact_Length 422
Num_Strings 6

TLV Section of CEH
TLV_DSR_String AlphaServer ES45 Model 2B
TLV_DDR_String HSG80
TLV_Sys_Serial_Num SW04400025KC
TLV_Time_as_Local Wed 3 Nov 2004 15:48:09 GMT-07:00
TLV_OS_Version V7.3-2
TLV_Computer_Name LAXE45
Entry_Type 1,001

EMB_Block
emb_ertcnt 16 Error Count
emb_ertmax 16 Max error count
emb_iosb 0
emb_sts x1802 0110
emb_class 1 Disk Class
emb_type 54
emb_rqpid 65,845
emb_boff 0
emb_bcnt 0
emb_media 167,772,161
emb_unit 40
emb_errcnt 1
emb_opcnt 0
emb_ownuic x0001 0004
emb_char x1CC5 4008
emb_slave 0
emb_func 8
emb_name_len 6
emb_name $1$DGA
emb_dtname_len 5
emb_dtname HSG80

Generic DK Driver Header Revision 3
DK_Longword_Count 49
DK_Error_Type x05 Extended Sense Data from Device
DK3_SCSI_ID x0000 0000 0000 0003
DK3_SCSI_LUN x0000 0000 0000 0400
DK_Port_Status x0000 0001
DK3_SCSI_CMD_Len 6
SCSI_Status 2 Check Condition
DK3_Additional_Data_Len 160

Last Failure Event Sense Data Response - 01
HS_Error_Code x70
HSZ_Sense_Key x06
Sense_Key[3:0] x6 Unit Attention
Additional_Sense_Len 152 Std Length for HSZ/HSGxx
ASCQ_ASC x00A0
ASC[7:0] xA0 Vendor Specific ASC
ASCQ[15:8] x0
HS_Instance_Code x0102 030A Intentional Restart or Unrecoverable Firmware Inconsistency
Template_Type x01 Last Failure Event
Template_Flags x20
Reserved_1_1 x0000
Reserved_1_2 x0000
Reserved_1_3 x0000
Reserved_1_4 x0000
Reserved_1_5 x0000
Reserved_1_6 x0000
Reserved_1_7 x0000
Display_Message_1 0
Ctrl_Serialnum ZG93513777
Ctlr_Firmware_Rev V87F
LUN_Status x00
Last_Fail_Code x2009 0010
Last_Failure_Parameter0 x0000 0000
Last_Failure_Parameter1 x0000 0000
Last_Failure_Parameter2 x0000 0000
Last_Failure_Parameter3 x0000 0000
Last_Failure_Parameter4 x0000 0000
Last_Failure_Parameter5 x0000 0000
Last_Failure_Parameter6 x0000 0000
Last_Failure_Parameter7 x0000 0000
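
For anyone reading the sense data by hand, the report's own annotations (Sense_Key[3:0], ASC[7:0], ASCQ[15:8]) describe how the raw values split into fields. A minimal sketch of that split, assuming the bit positions are exactly as the report labels them; the function name and dictionary keys are made up for illustration:

```python
def split_sense_fields(sense_key_byte: int, ascq_asc_word: int) -> dict:
    """Split raw sense values using the bit ranges printed in the report."""
    return {
        # Sense_Key[3:0]: low nibble of the sense key byte
        "sense_key": sense_key_byte & 0x0F,
        # ASC[7:0]: low byte of the combined ASCQ_ASC word
        "asc": ascq_asc_word & 0xFF,
        # ASCQ[15:8]: high byte of the combined ASCQ_ASC word
        "ascq": (ascq_asc_word >> 8) & 0xFF,
    }


# Values copied from the entry above: HSZ_Sense_Key x06, ASCQ_ASC x00A0
fields = split_sense_fields(0x06, 0x00A0)
print({name: hex(value) for name, value in fields.items()})
```

Sense key 6 is Unit Attention in standard SCSI, which matches the report. The xA0 ASC is vendor specific, so its meaning has to come from the HSG80 documentation rather than from generic SCSI tables.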
 
Wed 3 Nov 2004 15:48:09 GMT-07:00 => 08:48 (this could be one of the inconsistencies in the logs).
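
Which way the offset is applied changes the answer, so here is a rough sketch of both readings of that timestamp. It assumes the GMT-07:00 in the log is a plain fixed offset; the function and variable names are only illustrative:

```python
from datetime import datetime, timedelta, timezone

# Fixed offset taken from the "GMT-07:00" shown in the log entry
LOCAL = timezone(timedelta(hours=-7))


def two_readings(year, month, day, hour, minute, second):
    # Reading A: the logged wall-clock time is already local (GMT-07:00),
    # so convert it to GMT.
    logged_as_local = datetime(year, month, day, hour, minute, second, tzinfo=LOCAL)
    # Reading B: the logged wall-clock time is really GMT,
    # so convert it to local time.
    logged_as_gmt = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    return logged_as_local.astimezone(timezone.utc), logged_as_gmt.astimezone(LOCAL)


gmt_if_local, local_if_gmt = two_readings(2004, 11, 3, 15, 48, 9)
print("Reading A (log is local) -> GMT:  ", gmt_if_local.strftime("%H:%M"))
print("Reading B (log is GMT)   -> local:", local_if_gmt.strftime("%H:%M"))
```

The 08:48 figure only comes out under reading B, where 15:48 is taken to be GMT. If the log time is already local, the event was at 15:48 local, which lines up with the "around 3PM" in the original post.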

The drives are spinning up correctly on another box.
LED -> AC power loss - 2 PSUs

I'm clutching at straws, but could there be a short causing the reset switch (if there is one) or the power switch to ground against the chassis, resetting or powering off the machine?

Code:
  Intentional Restart or Unrecoverable Firmware Inconsistency

As I said, just clutching
HTH
--Paul

Nancy Griffith - songstress extraordinaire,
and composer of the snipers anthem "From a distance ...
 