Log files erased on reset

Status
Not open for further replies.

chang542

Technical User
Oct 25, 2005
20
0
0
US
I'm running four 8610s as two switch blocks with ISTs and an SMLT between them. We ran into a problem a few days back when all four hung, pegged at 100% CPU. The only way to get them back online was to kill the power and reset them.

When I went through the system logs the day after the reset, there was a huge gap in the files. The entries for the entire week leading up to the outage were missing. I had looked at the logs during that week and seen many entries, but they are all gone now.

Has anybody seen this type of thing? A hard reset corrupts your log and it starts overwriting from about the same random point in time on all four Passports? Nortel is not providing any help, and my own attempts to reproduce the issue have all been unsuccessful. I can cut the power, kick it, and all the logs remain just as they were.

On a hard reset like that I found that the system will stop writing to the current xxx.000 file and begin writing to a new one. However that old .000 file should still be there with all the info.
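
For anyone wanting to pin down where a gap like this starts and ends in a saved log, here is a minimal sketch. It assumes each entry carries a timestamp of the form [MM/DD/YY HH:MM:SS]; that format, the function names, and the sample lines are all made up for illustration and would need adjusting to the real log.

```python
import re
from datetime import datetime

# Assumed timestamp shape: [MM/DD/YY HH:MM:SS]. Adjust to the real log format.
STAMP = re.compile(r"\[(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\]")


def parse_stamps(lines):
    """Yield (line_number, datetime) for every line that carries a timestamp."""
    for number, line in enumerate(lines, start=1):
        match = STAMP.search(line)
        if match:
            yield number, datetime.strptime(match.group(1), "%m/%d/%y %H:%M:%S")


def largest_gap(lines):
    """Return (gap, line_before, line_after) for the widest jump between
    consecutive timestamped lines, or None if fewer than two were found."""
    best = None
    previous = None
    for number, stamp in parse_stamps(lines):
        if previous is not None:
            gap = stamp - previous[1]
            if best is None or gap > best[0]:
                best = (gap, previous[0], number)
        previous = (number, stamp)
    return best


# Invented sample lines, not taken from a real switch.
sample = [
    "[10/16/05 02:14:07] card 10 initialized",
    "[10/22/05 09:30:41] boot sequence started",
    "[10/22/05 09:31:02] card 1 initialized",
]
gap, before, after = largest_gap(sample)
print(f"largest gap: {gap} between lines {before} and {after}")
```

Running the same check on the log from each of the four chassis would show whether the gap really begins and ends at the same entries on all of them.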
 
I've never had that particular issue, but when we had 8690s and 8691s I had a few of those PCMCIA cards go bad. Eventually I switched to compact flash cards and PCMCIA adapters. I didn't have any further problems, but we've switched to 8692 CPUs now.

It sure sounds like a corrupt filesystem on the flash cards, but it'd be weird that it happened to them all. If there is filesystem corruption it'd be worth reformatting them to make sure they don't continue to have issues. We also do syslogging to a remote collector, although if you're having real issues a local log is more complete.
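
As a rough illustration of the remote collector idea, here is a minimal sketch of a UDP listener that appends whatever it receives to a file, so a copy of the log survives even if the card on the switch does not. The output file name and the port are assumptions (standard syslog uses UDP 514, which normally needs elevated privileges to bind), and a real deployment would more likely use an existing syslog daemon.

```python
import socketserver
from datetime import datetime, timezone

LOG_PATH = "passport-remote.log"  # hypothetical output file
LISTEN = ("0.0.0.0", 5514)        # assumed port; standard syslog is UDP 514


def format_record(sender_ip, payload, received_at):
    """Build one output line: receive time, sender address, raw message text."""
    text = payload.decode("utf-8", errors="replace").strip()
    return f"{received_at.isoformat()} {sender_ip} {text}"


class SyslogHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        payload = self.request[0]
        record = format_record(
            self.client_address[0], payload, datetime.now(timezone.utc)
        )
        with open(LOG_PATH, "a", encoding="utf-8") as log:
            log.write(record + "\n")


def serve():
    """Run the collector until interrupted."""
    with socketserver.UDPServer(LISTEN, SyslogHandler) as server:
        server.serve_forever()
```

Calling serve() starts the listener. Stamping each record with the collector's own receive time gives a second clock to compare against if the timestamps on the switch are ever in doubt.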

Speaking of real issues, did you find the root cause of your pegged CPUs?

 
We have logging to PCMCIA enabled, so the flash isn't affected. The file system appears fine, though: it works and writes the logs with no issue. I've been saving them every day since then and haven't missed a beat.

Generally, on a reset, the log stops writing to the .000 file and starts writing to a new one.

I just found something interesting in my logs. On all four Passports (keep in mind nobody has ever heard of this on just one, let alone four) the logging ends at the exact same point on a Sunday, just after the 10th card initialized. Then there is nothing for the entire week, and they all start logging again at bootup the following Saturday... AND that .000 file ends at the exact same point, just after the 10th card initialized.

In addition to that, the regular syslog has those same entries from bootup on that Saturday but keeps going up until the present day.

The system wouldn't log to two different files simultaneously, would it?

From the looks of it, those entries were copied and pasted from the regular syslog into the old log file, and the week's logs were erased manually.

Can anybody come up with any other theories?
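
One way to test the copy-and-paste theory is to look for the longest run of identical consecutive lines shared by the two logs. The sketch below uses Python's difflib for that. The function name and sample entries are invented for illustration, and it assumes both logs have been reduced to comparable lines first (for example, by stripping any prefix that only one of them adds).

```python
from difflib import SequenceMatcher


def longest_shared_run(first_lines, second_lines):
    """Return (start_in_first, start_in_second, length) of the longest block
    of consecutive lines that appears identically in both logs."""
    matcher = SequenceMatcher(None, first_lines, second_lines, autojunk=False)
    match = matcher.find_longest_match(
        0, len(first_lines), 0, len(second_lines)
    )
    return match.a, match.b, match.size


# Invented entries standing in for the old .000 file and the regular syslog.
old_log = ["boot", "card 1 up", "card 10 up"]
regular_log = ["link down", "boot", "card 1 up", "card 10 up", "port 1/1 up"]

start_old, start_regular, length = longest_shared_run(old_log, regular_log)
print(f"{length} shared lines, starting at index {start_old} in the old log "
      f"and index {start_regular} in the regular syslog")
```

A long shared run that ends exactly where the old file ends would fit the theory, though it would not by itself rule out the system having written the same boot messages to both places.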
 
Oh, and in regard to the cause of the problem...

Is anybody familiar with a McAfee product called Foundstone?

It's scanning software that probes any IP-enabled device for pretty much every known vulnerability.

Has anyone ever heard of or experienced any issues on their network with it?
 
Sorry, I was also talking about the PCMCIA flash cards, not the internal flash. I should have been more specific.

By 'regular syslog' do you mean a remote syslog server?

I haven't heard of Foundstone, but we did have a spike in CPU usage when our security guys did their last security scans... the high CPU usage stopped after they moved on to other devices. I think we were running 4.0.2 code at that time.
 