It sounds like the daughterboard is corrupted. I would try to get into "pdt". If you see errors such as being unable to access the C drive, or "ERROR - Unable to create Access File", then the board is corrupted.
Doing a dump gives TEMU0131 :
(TEMU0131 Problem setting previous database file's information.)
>LD 43
EDD000
.EDD
DB SEQ NUM = 2572
CONFIG
EDD007
.
.
EDD000
TEMU131 Problem setting previous database files information.
CIOD157 INFO: CMDU 0 is ACTIVE, RDUN is ENABLED
Entering pdt gives an access error and shows there's no file structure in the /u directory
------------------------------------------------------------------------------------------------------------------------
PDT: login on /sdi/tty1
Password:
PDT in Progress. Please Wait....
** ERROR - Unable to create Access File.
pdt>
pdt> cd /u
pdt> ll
Directory of '/u':
SIZE DATE TIME NAME
---------- ----------- -------- ------------
512 Jul-30-2008 12:28:58 SMP_DB <DIR>
pdt> cd ..
It was suggested that a re-install would need to be done. However, the following may help.
Dump failure EDD007
The EDD failed at the ALARM_MGT point in the Dump
Associated errors point to problems with the files in the smp_db directory as below :
Extract from the EDD
-------------------------------
PLUGIN
CPND
CPND NM
GPHT
SPECIFIC DATA
HI
ALARM_MGT
EDD007
BUG9011 Can't create/open Alarm Management Database file /u/smp_db/smpconf.tmp.
BUG9011 Can't create/open Alarm Management Database file /u/smp_db/smpserv.tmp.
BUG9008 Error occurred writing Alarm Management Database /u/smp_db.
Navigating to the smp_db directory in pdt and listing its contents showed errors in the directory structure.
pdt> cd smp_db
pdt> ll
Directory of '/u/smp_db':
SIZE DATE TIME NAME
---------- ----------- -------- ------------
pdt>
This is what the directory structure should look like :
pdt> ll
Directory of '/u/smp_db':
SIZE DATE TIME NAME
---------- ----------- -------- ------------
512 Aug-14-2007 15:50:48 . <DIR>
512 Aug-14-2007 15:50:48 .. <DIR>
pdt>
To overcome the problem I deleted the smp_db directory and reprovided it as below :
The smp_db directory is off the 'u' directory
Directory of '/u' showing the smp_db dir
SIZE DATE TIME NAME
---------- ----------- -------- ------------
67 Nov-30-2027 00:03:22 COPYLOOP.DAT
512 Nov-30-2027 00:01:18 LOADWARE <DIR>
1114 Mar-08-2006 19:06:14 KEYCODE
512 Aug-13-1998 15:15:04 DB <DIR>
512 Aug-13-1998 15:15:04 RPT <DIR>
512 Aug-13-1998 15:15:04 PATCH <DIR>
512 Aug-13-1998 15:15:04 SMP_DB <DIR>
pdt>
So, from the 'u' directory, you can re-create the smp_db dir using these commands:
'rmdir smp_db' to remove the dir
'mkdir smp_db' to reprovide the directory.
After this the directory structure looked correct :
pdt> cd /u/smp_db
pdt> ll
Directory of '/u/smp_db':
SIZE DATE TIME NAME
---------- ----------- -------- ------------
512 Aug-14-2007 15:50:48 . <DIR>
512 Aug-14-2007 15:50:48 .. <DIR>
After this pdt change doing an EDD allowed access to the smp_db dir for read/write functions and the Dump completed.
Also look at this below.
Dump fails at the same point with an EDD007:
CPND
GPHT
SPECIFIC DATA
HI
EDD00007
There were no hardware information files within the /U/DB/HI directory and we were unable to copy files to it. Once the HI directory was removed and re-provided we could then copy the relevant files from /P/HIDIR to /U/DB/HI. The data dump was then successful.
I also found the following, which requires a patch to be fitted at 22.46 and 23.47.
Patch 11848 fixes a problem where EDD007 is output during the midnight dump.
Error Description
CONDITION:
a) Option 11c running 24.04f, daughterboard NTDK81 is equipped
b) Patch MPLR11502 bv82247 is in service
c) Ld 43 in midnight routine.
ACTION:
1) Dump is performed during midnight routine
EXPECTED RESPONSE:
1) Dump is performed with no problems
ACTUAL RESPONSE:
1) EDD 007 is printed, no dump because of a Flash write failure
DEFECT CAUSE:
The c: drive block driver will become unstable under heavy load often seen during MIDN EDD's. Various error messages are printed to the console, however the bug message BUG 6347 'Logical block mapping is invalid' is the most interesting.
The workings of the c: drive are complex so for the rest of this explanation an understanding of the following things is assumed:
-properties of Flash EEPROM
-the IO subsystem as implemented by VxWorks
-the Dos File System as it is implemented by VxWorks
-the c: drive block driver (ssDrv) and its interface to the system and inner workings/mechanisms
-the basic theories behind synchronization of concurrent access to shared resources
Symptom one:
Under high load the c: drive will obviously take longer to service read and write requests than it does under normal load. Under high load the packer needs to run almost constantly to keep the free track pool within threshold values which in turn increases the wait time for lower priority processes such as the tRpt task (for example).
When the tRpt task times out waiting on the admin semaphore it will try to 'un-stuck' it. The problem arose when the low priority task entered the ssdrvStuckSemaphore function: the high priority task may have already given back the semaphore (that's what the packer does to relieve hogging), so it (tRpt) will get a 0x0 back for the task, which will pass its criteria, and it will then semMGiveForce the admin semaphore! This isn't a big problem unless another high priority task grabs the semaphore again, and tRpt won't know any different. Under these circumstances two tasks could then be working within their critical regions and messing up the logical block map and any other really important data structures!
Symptom two:
The packer process would panic and call packtrack with threshold one if it did not reclaim a track right away. This would essentially 'hose' the system (to use the most technical term possible), leading to a higher probability of seeing 'Symptom one'. One can now easily see that this would cause a cascading effect where the entire system would become really really 'hosed'.
Symptom three:
The mechanism to recover erasable tracks that were not recovered by an erase task (because of timeouts, BERR or whatever) was preventing the packtrack function from actually packing tracks, as it would attempt to erase the same track over and over until the actual erase was complete. In high load situations this allowed the drive to drop to dangerously low levels of free tracks, often resulting in a 'no free tracks' bug.
Other minor errors existed that are both not interesting and not worth mentioning.
SOLUTION:
Symptom one:
The ssdrvStuckSemaphore function was modified to do a taskLock before checking anything, to ensure that the snapshot it takes to do its checking is the same when it actually performs the check! As well, an additional check is performed to not let task IDs of 0x0 get through...
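To picture the Symptom one fix, here is a rough Python model of it. This is not the ssDrv code (the patch notes don't show any); AdminSemaphore, force_release_if_stuck and the looks_stuck callback are names made up for the illustration, and an ordinary threading.Lock stands in for the taskLock/taskUnlock pair.

```python
import threading


class AdminSemaphore:
    """Toy model of the c: drive admin semaphore (hypothetical, for illustration)."""

    def __init__(self):
        self.owner = 0                    # 0 means nobody holds the semaphore
        self._guard = threading.Lock()    # stands in for taskLock/taskUnlock

    def take(self, task_id):
        """Take the semaphore if it is free; return True on success."""
        with self._guard:
            if self.owner == 0:
                self.owner = task_id
                return True
            return False

    def give(self, task_id):
        """Normal release by the task that owns the semaphore."""
        with self._guard:
            if self.owner == task_id:
                self.owner = 0
                return True
            return False

    def force_release_if_stuck(self, looks_stuck):
        """Force-release only when a real owner is judged to be stuck.

        The owner snapshot and the check happen under one guard, so the
        owner cannot change in between, and an owner id of 0 is rejected
        before the stuck test is even consulted.
        """
        with self._guard:
            owner = self.owner
            if owner == 0:
                return False              # already given back, nothing to force
            if not looks_stuck(owner):
                return False              # owner is healthy, leave it alone
            self.owner = 0
            return True
```

In the unpatched behaviour described above, a stuck test that accepts task id 0 would let the low priority task force the semaphore free even though nobody was stuck. Rejecting 0 up front, inside the guarded region, is what closes that window in this model.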
Symptom two:
This was changed to take a more gentle approach to the situation where it would use a basic linear equation of sorts to adjust the threshold value being used to call packtrack. The limits used for low free tracks and high free tracks were used slightly differently to also make the packer less aggressive.
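The patch notes don't give the actual equation, so the following is only a guess at the general shape of a "basic linear equation" for the threshold. It assumes a lower threshold means more aggressive packing (threshold one being the old panic value), and the function name, parameter names and limit values are all invented for the example.

```python
def pack_threshold(free_tracks, low_free, high_free,
                   min_threshold=1, max_threshold=9):
    """Scale the packtrack threshold linearly with the free track count.

    At or below low_free the most aggressive value (min_threshold) is
    returned; at or above high_free the gentlest value (max_threshold)
    is returned; in between the value is interpolated on a straight line.
    """
    if high_free <= low_free:
        raise ValueError("high_free must be greater than low_free")
    # Clamp so counts outside the limits do not overshoot the range.
    clamped = max(low_free, min(free_tracks, high_free))
    fraction = (clamped - low_free) / (high_free - low_free)
    return round(min_threshold + fraction * (max_threshold - min_threshold))
```

With made-up limits of 4 and 20 free tracks, a pool of 12 free tracks sits halfway between them and gets the middle threshold of 5, rather than jumping straight to 1 the moment a track is not reclaimed.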
Symptom three:
To prevent this, extra checks were put in place to check whether an erase was in progress using the per-chip state variable. This variable wasn't always reliable because it was updated before the update to the accounting records was complete, resulting in the same phenomenon as above. Therefore some changes were made to where the state was updated, to create a more accurate picture of what was happening for the erase.
I think it's wise to obtain as many listings as possible, such as DNB, TNB etc. I have done a listing script file for use with Procomm Plus that would help.
There is another thing that might help below.
Patch 11199.........22.46 and 23.47.
Error messages during Datadump (EDD007), no access to PDT, no access to Overlays 117 and 135. A manual INI solves the problem. In marginal condition, it is also reported that the problem can only be solved by a Reload.
Memory leakage problem: It is possible for PDT to lose 8K of memory for every invalid login if an interactive shell is used. If a user types the incorrect password twice and the user is denied entry to PDT, the system will lose memory. This is one of many scenarios that can result in a memory leak within the cpsPdtShell in pdtShell.c. The memory leak comes from the ledOpen routine, which allocates a history of size 20 * 408 = 8K. There are places which return without doing a ledClose (which frees up the history).
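To put numbers on that, 20 * 408 is 8160 bytes, which is the "8K" quoted. The Python sketch below is only a model of the leak pattern, not the pdtShell.c code: ToyHeap, login_leaky and login_fixed are invented names, the alloc and free calls merely stand in for ledOpen and ledClose, and the try/finally is just one common way of closing this kind of early-return leak (the patch notes don't say how mplr11199 actually does it).

```python
HISTORY_ENTRIES = 20
ENTRY_BYTES = 408
HISTORY_BYTES = HISTORY_ENTRIES * ENTRY_BYTES   # 8160 bytes, roughly 8K


class ToyHeap:
    """Counts bytes currently allocated (hypothetical stand-in for system memory)."""

    def __init__(self):
        self.in_use = 0

    def alloc(self, size):
        self.in_use += size
        return size

    def free(self, size):
        self.in_use -= size


def login_leaky(heap, password_ok):
    """Models the bug: the denied-login path returns without freeing the history."""
    history = heap.alloc(HISTORY_BYTES)     # stands in for ledOpen
    if not password_ok:
        return False                        # early return, history never freed
    heap.free(history)                      # stands in for ledClose
    return True


def login_fixed(heap, password_ok):
    """Models a fix: the history is freed on every path out of the function."""
    history = heap.alloc(HISTORY_BYTES)
    try:
        return password_ok
    finally:
        heap.free(history)
```

Ten denied logins through login_leaky leave 81600 bytes allocated in this model, while the same ten through login_fixed leave none.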
Patches 11199,11843,11848,11965.......22.16 22.46 23.47.
A new occurrence of a C-Drive (Software Daughter Board) lockup problem has been recently experienced in two different European countries. When the problem occurs the C-Drive is no longer available for use. The Flashrom where the C-Drive is located must be erased and the software reinstalled. Contact the MOC for help in doing this. As a workaround solution, the daughterboard (NTSK06) can be replaced by a new one, then the Meridian Software reinstalled via PCMCIA.
Bad Patch: Feedback from the field was that a problem started to occur with deployment of patch mplr11140. Investigation by the Technology group confirmed that this patch might have to be improved. The problem manifests itself in the form of:
- Error message EDD007 during Datadump
- No access to PDT
- No access to overlays 117 and 135
- BUG 6347, BUG 6351, BUG 6348, BUG 6363, BUG 6364
- Memory leakage (You can lose 8K of memory every time an invalid password is input when trying to access pdt. There are other scenarios that can result in a memory leak within the cpsPdtShell in pdtShell.c)
- Z-drive backup errors
- System freeze
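If you capture the TTY output to a file, a short script can check it for the EDD and BUG codes in the list above. This is just a sketch in Python and is not the Procomm Plus listing script mentioned earlier; the function name and the idea of matching codes with or without a space or extra leading zeros (EDD007, EDD 007, EDD00007) are my own assumptions, based on the variations seen in the output in this thread.

```python
import re
from collections import Counter

# Codes taken from the symptom list above, stored as (prefix, number).
WATCHED = {
    ("EDD", 7),
    ("BUG", 6347), ("BUG", 6348), ("BUG", 6351),
    ("BUG", 6363), ("BUG", 6364),
}

# Prefix, an optional single space, then the digits.
CODE_RE = re.compile(r"\b(EDD|BUG) ?(\d+)\b")


def count_watched_codes(text):
    """Return a Counter of watched (prefix, number) codes found in text."""
    counts = Counter()
    for prefix, digits in CODE_RE.findall(text):
        key = (prefix, int(digits))     # int() drops any leading zeros
        if key in WATCHED:
            counts[key] += 1
    return counts
```

Feed it the text of a capture file and look at what comes back. An empty result only means none of these particular codes were printed in that capture, not that the drive is healthy.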
Affected sites
Option 11C running Rls 22 or Rls 23 market Releases, and Rls 24 pre-market Release.
Preventative action
Install the following suite of patches:
mplr11199 (Cures Memory Leak)
mplr11843 (System freeze due to amd flash driver's blocking under certain hardware failure conditions).
mplr11848 (Various C: Drive errors mostly seen as BUG 6347. Affects entire system, reported during datadumps, when the datadump fails you will see EDD 007. Replaces patch mplr11502)
mplr11965 (Cures the Z drive problems)
All the best
Firebird Scrambler
Meridian 1 / Succession and BCM / Norstar Programmer in the UK
If it's working, then leave it alone!.