IP Office 500v2 Random Reboots

Status
Not open for further replies.

cyric297

IS-IT--Management
Aug 18, 2015
11
US
Hi All,

I have an IP Office 500v2 that randomly reboots from time to time and support so far has been absolutely useless (including our vendor support). To their credit, they have tried to solve the problem, but have been unable to do so as of yet (and we are going on 6 months with this issue).

The Problem: IPO500v2 reboots randomly with the following error:
Reason: Abnormal Termination
Termination cause: <FileSystemSMXFSTask::GetDriveNum: Bad d> addr=f03562bc d=5 pc=f01bf568 stack=f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200

IP Office 500v2 -> firmware 8.1 (95)

We've tried: formatting the SD card, replacing the entire IPO hardware

I'm open to all solutions anyone may want to try, as this is very frustrating!
 
Yes sir. We replaced everything. I wish I could tell you whether this same error caused the reboots on the old system, but unfortunately I do not have that information. We also upgraded the firmware per support. They were afraid the SD card was formatted by the wrong system, so we even re-formatted the card with the IPO system (or whatever was recommended by support).

The error seems to point to a "Bad d", which I am assuming means bad disk, but I have no idea how to get more from that error, or how to find it in the logs (I've searched what I know of them and cannot find this error).

 
If you've replaced the hardware and done all that, are there any external factors that could affect the phone system? For example: is it earthed? Have you changed where it gets its power from, such as a different socket on a completely different ring? Does the system have VM Pro connected, or do you use a second SD card for storage on the IPO?
 
I also think this is somehow related to the power feeding the phone system. Have it tested by an electrician.
 
With all respect, you cannot blame your vendor for this; either Avaya is to blame or you are.
If you had raised a case with Avaya, they would have solved this within two weeks, three at most. But only if you have an IPOSS contract.
If you don't have an IPOSS contract, then you're to blame for being too cheap to buy IPOSS.
It is like buying an expensive car without insurance: sure, you get a full warranty for a year from the factory, but if you run into a tree, that is not a system failure, is it? You need insurance to cover that damage.
With IP Office it is the same: any hardware failure is covered under warranty, but a software failure is not, unless you use the default factory programming, and you don't, do you?
That is what IPOSS covers: software insurance, including free upgrades. It is not free, but it is not that expensive either.
 
@averageidiot, @joe2938 I'll check to see if power was ever changed. It is on UPS power that is line conditioned. I'd doubt that to be the case but to cover all the bases I will make sure that it is either changed or has been changed.

@intrigrant I'm not 100% sure on the contract we had with the vendor or with avaya. I'm 100% sure there WAS some kind of support contract in place, whether or not it was to that level (IPOSS) I'm not sure, as I've just been given charge of this system (however I have known about this system and issue since it began, it was just not my system to support). I will check our level of service and if we do have one of these contracts, I suppose I can get them involved. It'd be nice to go ahead and fix if it is fixable on my end however. I am not aware if any of the programming has changed from the factory settings, is there a way I can check?

Thanks everyone for your help so far.
 
Of course the programming has changed, otherwise it wouldn't fit your needs.
I put it a bit black and white; the real situation is always more complex when it comes to software warranty.
The point is that Avaya doesn't help you without an IPOSS contract.
And 8.1 is not a version Avaya supports, so their first response will be "Upgrade to the latest version of 9.0 or 9.1 and if the problem persists call us back".
If you have IPOSS, the upgrade can be done free of charge, and maybe just upgrading will solve the problem (an upgrade license isn't that expensive).
If it doesn't, and Avaya comes up with a solution and it is a software bug, then support is free of charge; if it is a configuration problem, you get a bill if the hours they spend exceed the IPOSS contract.
 
@intrigrant Oh I see, you are calling the configuration the programming. Yes, the configuration has changed to suit our needs; wouldn't all systems be that way? My guess is you are right: if there is a new firmware/upgrade then we'd definitely be doing that first. I don't believe the issue here is cost, but whether or not things will work. They've already spent a lot on this for a mostly unreliable system. And that I can blame on the vendor...
 
Maybe, I cannot judge that, but I have installed some hundreds of IP Offices and I can say that since R7, reboots like yours are very rare. In fact, we have no customers with that kind of problem.
The whole point is to find what causes the reboot and find a way around it or get a fix from Avaya.
I am very experienced in fault finding on IP Office, but spurious reboots aren't the easiest to trace.
What is at least necessary is a Monitor log file that holds, let's say, the last five minutes before the reboot.
 
@intrigrant how can I get to that log, or force the system to log during that time? The system should reboot itself somewhere around 4 or 5 in the morning. I'm slightly familiar with the manager so far and the system status... but none of these logs or alerts seem very useful.
 
Find the program called Monitor.
Start it and connect using the IP address and the password.
The password is probably still the default, so try that.
When it is running, go to File -> Log Preferences, set the log mode to every hour and make sure that log to file is turned on.
Select a log filename (text file) and let it run.
When the reboot occurs, go to Monitor and press the rollover log button (the green bent arrow).
Find the log and get the last couple of minutes of it before the reboot.
Include the reboot in that log and put it here.
If there is any info that is secret, replace it.
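
The "get the last couple of minutes before the reboot" step can be sketched in Python. This is only a sketch under assumptions: it assumes each line of the Monitor text log starts with an uptime stamp such as `39808644mS`, and that Monitor writes a `contact lost with` banner when the unit stops answering, as in the capture posted further down this thread. The function name `window_before_contact_loss` and the five-minute default are hypothetical and not part of Monitor.

```python
import re

# Leading uptime stamp on a Monitor text-log line, e.g. "39808644mS PRN: ..."
UPTIME_RE = re.compile(r"^(\d+)mS\s")
# Banner assumed to be written by Monitor when the unit stops answering
LOST_MARKER = "contact lost with"


def window_before_contact_loss(lines, minutes=5):
    """Return the log lines from the last `minutes` of uptime before contact was lost."""
    # Index of the first "contact lost" banner; fall back to the end of the log.
    end = next((i for i, line in enumerate(lines) if LOST_MARKER in line), len(lines))

    # Collect (line_index, uptime_ms) for stamped lines ahead of the banner.
    stamped = []
    for i, line in enumerate(lines[:end]):
        match = UPTIME_RE.match(line)
        if match:
            stamped.append((i, int(match.group(1))))
    if not stamped:
        return []

    last_ms = stamped[-1][1]
    cutoff = last_ms - minutes * 60_000
    start = stamped[-1][0]
    # Walk backwards until a line is older than the window
    # (or belongs to an earlier boot session with a higher uptime).
    for i, ms in reversed(stamped):
        if ms < cutoff or ms > last_ms:
            break
        start = i
    return lines[start:end]
```

Reading the file with `open(path, encoding="utf-8", errors="replace").read().splitlines()` and passing the result in gives the slice to post; any addresses or names in it still need to be masked by hand.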

BAZINGA!

I'm not insane, my mother had me tested!
 
@tlpeter Thank you very much. Got monitor up and running and the very first thing it said was this:

102614mS PRN: +++ START OF ALARM LOG DUMP +++
102614mS PRN: ALARM: 03/08/2015 18:25:08 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102614mS PRN: ALARM: 04/08/2015 05:29:25 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102614mS PRN: ALARM: 04/08/2015 19:45:47 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102614mS PRN: ALARM: 05/08/2015 06:50:04 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102615mS PRN: ALARM: 05/08/2015 18:52:00 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102615mS PRN: ALARM: 06/08/2015 05:56:17 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102615mS PRN: ALARM: 06/08/2015 18:27:59 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102615mS PRN: ALARM: 07/08/2015 05:32:17 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102615mS PRN: ALARM: 07/08/2015 19:07:59 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102615mS PRN: ALARM: 08/08/2015 06:12:16 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102616mS PRN: ALARM: 08/08/2015 17:16:33 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102616mS PRN: ALARM: 09/08/2015 04:20:51 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102616mS PRN: ALARM: 09/08/2015 16:25:08 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102616mS PRN: ALARM: 10/08/2015 03:29:25 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102616mS PRN: ALARM: 10/08/2015 18:31:55 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102616mS PRN: ALARM: 11/08/2015 06:35:13 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102617mS PRN: ALARM: 11/08/2015 18:17:48 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102617mS PRN: ALARM: 12/08/2015 05:22:05 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102617mS PRN: ALARM: 12/08/2015 18:27:07 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102617mS PRN: ALARM: 13/08/2015 06:31:24 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102617mS PRN: ALARM: 13/08/2015 18:40:20 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102617mS PRN: ALARM: 14/08/2015 05:44:37 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102618mS PRN: ALARM: 14/08/2015 18:20:56 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102618mS PRN: ALARM: 15/08/2015 06:25:12 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102618mS PRN: ALARM: 15/08/2015 17:29:30 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102618mS PRN: ALARM: 16/08/2015 04:33:47 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102618mS PRN: ALARM: 16/08/2015 15:38:04 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102618mS PRN: ALARM: 17/08/2015 02:42:21 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102619mS PRN: ALARM: 17/08/2015 18:25:19 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102619mS PRN: ALARM: 18/08/2015 05:29:37 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102619mS PRN: ALARM: 18/08/2015 19:50:10 IP 500 V2 8.1(95) <TLB Data Load Error > CRIT RAISED addr=d123f5cb d=5 pc=f031172c f03115e8 f03117d4 f0179ea4 f015fcd8 f030b1e0
102619mS PRN: ALARM: 19/08/2015 06:54:12 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200
102619mS PRN: +++ END OF ALARM LOG DUMP +++


However, I was unable to start it prior to it restarting this morning. I'll try to catch it tomorrow. In fact, I didn't even know it was rebooting in the evenings, but apparently it is (according to this log!), so I may be able to catch it sometime this afternoon. I have no idea what that TLB Data Load Error is. I'll also be checking on our support status today as well... I'll keep you guys posted, thanks so much for your help!
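
To see how often the alarm is raised, the dump above can be parsed and the gaps between consecutive alarms computed. This is a rough sketch, assuming the line layout shown in the dump and day/month/year dates (entries such as 13/08/2015 rule out month/day); `parse_alarms` and `hours_between` are hypothetical helper names.

```python
import re
from datetime import datetime

# Matches alarm-dump lines such as:
# "102614mS PRN: ALARM: 03/08/2015 18:25:08 IP 500 V2 8.1(95) <...> CRIT RAISED ..."
ALARM_RE = re.compile(r"ALARM: (\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) .*?<([^>]*)>")


def parse_alarms(lines):
    """Return (timestamp, cause) for every ALARM line, reading dates as day/month/year."""
    alarms = []
    for line in lines:
        match = ALARM_RE.search(line)
        if match:
            when = datetime.strptime(match.group(1), "%d/%m/%Y %H:%M:%S")
            alarms.append((when, match.group(2).strip()))
    return alarms


def hours_between(alarms):
    """Hours elapsed between consecutive alarms, rounded to one decimal place."""
    return [
        round((later[0] - earlier[0]).total_seconds() / 3600, 1)
        for earlier, later in zip(alarms, alarms[1:])
    ]


sample = [
    "102614mS PRN: ALARM: 03/08/2015 18:25:08 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568",
    "102614mS PRN: ALARM: 04/08/2015 05:29:25 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568",
    "102614mS PRN: ALARM: 04/08/2015 19:45:47 IP 500 V2 8.1(95) <FileSystemSMXFSTask::GetDriveNum: Bad d> CRIT RAISED addr=f03562bc d=5 pc=f01bf568",
]
gaps = hours_between(parse_alarms(sample))  # [11.1, 14.3]
```

On the first three entries the gaps come out at about 11.1 and 14.3 hours, which fits the morning and evening pattern visible in the dump.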
 
You can recognize a reboot easily: the leftmost column shows the uptime in milliseconds, and a reboot makes it start again from 0mS.
So this log doesn't show a reboot.
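
The uptime check described above can be automated with a few lines of Python. This is a minimal sketch, assuming every relevant log line begins with a stamp like `102614mS`; `find_reboots` and `as_hms` are hypothetical helper names, and a reboot is taken to be any point where the stamp goes down instead of up.

```python
import re

# Leading uptime stamp on a Monitor log line, e.g. "102614mS PRN: ..."
UPTIME_RE = re.compile(r"^(\d+)mS\s")


def find_reboots(lines):
    """Return (uptime_before_ms, uptime_after_ms) for every drop in the uptime stamp."""
    reboots = []
    previous = None
    for line in lines:
        match = UPTIME_RE.match(line)
        if not match:
            continue  # banner or blank line without a stamp
        current = int(match.group(1))
        if previous is not None and current < previous:
            reboots.append((previous, current))
        previous = current
    return reboots


def as_hms(ms):
    """Format an uptime in milliseconds as hours, minutes and seconds."""
    seconds = ms // 1000
    return f"{seconds // 3600}h{(seconds % 3600) // 60:02d}m{seconds % 60:02d}s"
```

Run against the alarm dump, where the stamps only climb from 102614mS to 102619mS, `find_reboots` returns an empty list, which matches the point that this log contains no reboot.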
 
@intrigrant Right, I see what you are saying, but I just think that is a list of the last logs. This is what popped up right after (102 seconds after) rebooting it this morning. I assure you that each time that comes across, it reboots. That must just be a dump (as it says) of the alarm log.
 
The log is indeed just after a startup.
The alarms do give me the idea that your SD card is corrupt. Have you had a new one? It is possible to order a new one and then swap the licenses from the old card to the new one. An SD card costs about 30 euro, I believe, and a license swap is free of charge if you return the old SD card within ten working days after the license swap.
Maybe your vendor has a demo kit with an SD card that you can borrow for a few days. It may not have all the licenses you need, but just trying it for a few days would be interesting. In Monitor you can erase the alarm log, as only Avaya has use for it (sometimes). In my experience, if they have not had a similar case before, they cannot do anything with the alarm log; they will ask, as I did, for a log of the last 10 to 60 minutes before a reboot to find out the cause.
It can be anything: I have seen call pickup cause a reboot, or dialling a number too fast, or dialling too many digits, or whatever.
 
@intrigrant Supposedly they did use a new SD card (or switch it out) however I am inclined to side with you on this one. I will talk to the vendor and see if they can supply a new one. I should also have a log of before and after the alarm reboots the system tonight or tomorrow morning. I'll post it as soon as I have it. Thanks again for all of your help.
 
Ok, I've caught it rebooting. Here is the log in question just before, and just after:

39802183mS H323Evt: Recv: RegistrationRequest XXX.XXX.XXX.39; Endpoints registered: 92; Endpoints in registration: 0
39802930mS H323Evt: Recv: RegistrationRequest XXX.XXX.XXX.23; Endpoints registered: 92; Endpoints in registration: 0
39803887mS RES: Wed 19/8/2015 18:42:28 FreeMem=50155412 49381900(2) CachedMem=773512 CMMsg=6(7) Buff=5200 1416 1000 12364 5 Links=21689 BTree=0 CPU=51/189/3933/12555/18569/1
39803888mS RES2: IP 500 V2 8.1(95) Tasks=53 RTEngine=0 CMRTEngine=0 ExRTEngine=0 Timer=242 Poll=0 Ready=0 CMReady=0 CMQueue=0 VPNNQueue=0 Monitor=3 SSA=1 TCP=200 TAPI=0 ASC=1 SYS=MNTD OPT=UMNT SDSPD=2034
39803888mS RES4: XML MemObjs=45 PoolMem=2097152(1) FreeMem=2082064(1)
39806669mS H323Evt: Recv: RegistrationRequest XXX.XXX.XXX.100; Endpoints registered: 92; Endpoints in registration: 0
39806854mS H323Evt: Recv: RegistrationRequest XXX.XXX.XXX.23; Endpoints registered: 92; Endpoints in registration: 0
39807472mS H323Evt: Recv: RegistrationRequest XXX.XXX.XXX.87; Endpoints registered: 92; Endpoints in registration: 0
39808643mS PRN: Alarm log cyclic wrap
39808643mS PRN: Begin Stack Trace
39808644mS PRN: pc=f01bf568
39808644mS PRN: lr=f03562bc
39808644mS PRN: findfunc f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200 f030b1ac 00000000 00000000 00000000 00000000
39808644mS PRN: End Stack Trace
39808644mS PRN: ScheduleHistory NOT enabled: Task = MemoryCardMonitor File = ../platform/platform.cpp Line = 1186
39808644mS PRN: .FATAL FileSystemSMXFSTask::GetDriveNum: Bad d address=f03562bc d=5 pc=f01bf568 f01cc0dc f01bf950 f0dd4be4 f03581d4 f030b200 IP 500 V2 8.1(95)

********** contact lost with XXX.XXX.XXX.4 at 18:46:23 19/8/2015 - reselect = 1 **********
******************************************************************

********** SysMonitor v10.1 (63) **********

********** contact made with XXX.XXX.XXX.4 at 18:47:31 19/8/2015 **********

********** System (XXX.XXX.XXX.4) has been up and running for 26secs(26326mS) **********

********** Warning: TEXT File Logging selected **********


********** Warning: TEXT Logging to File STARTED on 19/8/2015 18:47:31 **********
26326mS PRN: Monitor Started IP=XXX.XXX.XXX.203 IP 500 V2 8.1(95) MedFusion
(IP Office: Supports Unicode, System Locale is default)
26327mS PRN: LAW=U PRI=2, BRI=0, ALOG=4, VCOMP=42, MDM=0, WAN=0, MODU=1 LANM=0 CkSRC=0 VMAIL=0(VER=0 TYP=1) 1-X=0 CALLS=0(TOT=0)
26536mS RES: Wed 19/8/2015 18:43:52 FreeMem=56559868 56425892(1) CachedMem=133976 CMMsg=2(3) Buff=5200 1436 1000 12381 5 Links=28461 BTree=0 CPU=0/0/4294967295/0/0/0
26536mS RES2: IP 500 V2 8.1(95) Tasks=45 RTEngine=0 CMRTEngine=0 ExRTEngine=0 Timer=100 Poll=0 Ready=0 CMReady=0 CMQueue=0 VPNNQueue=0 Monitor=2 SSA=0 TCP=15 TAPI=0 ASC=1 SYS=MNTD OPT=UMNT SDSPD=2034
26536mS RES4: XML MemObjs=0 PoolMem=0(0) FreeMem=0(0)
27220mS PRN: A:\system\ws already existed with attr 10
27221mS RES3: Tasks=(6227)StartUp (69)Daemon (22)FileSystemSMXFS
27221mS RES: Wed 19/8/2015 18:43:52 FreeMem=54250412 54119204(1) CachedMem=131208 CMMsg=2(3) Buff=5200 1433 1000 12364 5 Links=27913 BTree=0 CPU=467/583/478802/478802/478802/0
27221mS RES2: IP 500 V2 8.1(95) Tasks=51 RTEngine=0 CMRTEngine=0 ExRTEngine=0 Timer=95 Poll=0 Ready=0 CMReady=0 CMQueue=0 VPNNQueue=0 Monitor=2 SSA=0 TCP=15 TAPI=0 ASC=1 SYS=MNTD OPT=UMNT SDSPD=2034
27222mS RES4: XML MemObjs=45 PoolMem=2097152(1) FreeMem=2082064(1)
27222mS PRN: Config Write Wake Up
27290mS LIC: INFO: 1 power user licenses 0 0
27292mS PRN: Created Tone MOH Source 1
27292mS PRN: Playing tone MOH Source 1
27293mS PRN: A:\system\ws\up already existed with attr 10
27294mS PRN: A:\system\ws\br already existed with attr 10
27298mS PRN: System Started from System Card, Primary Directory
27299mS PRN: WARNING:
27299mS PRN: System was not Shutdown correctly
27299mS PRN:
27299mS PRN: Last Shutdown/Reboot: Wed 19/08/2015 07:38:29


 
Try to recreate the SD card.

BAZINGA!

I'm not insane, my mother had me tested!
 
My assembler skills are pretty rusty, but a bad d (double word) means, I think, that in this case the CPU is referencing a protected pagefile address. Pagefile faults are trappable, which is what the IPO did before rebooting rather than just hanging. How full is your SD card? In any event, only Avaya can diagnose this.
 