Backup fails 'Directory Not Found' & Corrupt job log entry

Status
Not open for further replies.

Yorkshireman2

Programmer
Jan 21, 2005
154
CA
I am new to server maintenance. Our Backup Exec 9.1 fails the same backup job every day with 'directory not found'.
The job log cannot open due to an error in the xml load process; then the text load freezes up and the whole program has to close.
I found the xml file and opened it in a text editor and found the offending line was corrupt. The directory-not-found line had a non-existent path with corrupt characters (like SOH in black background and foreign characters with accents above, and square symbols.)
This line was in the middle of backing up files with names that sound like 'symantec SMSE brightmail' or something like that (connected with symantec anti-spam rules I think).
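
For anyone wanting to locate such a line without eyeballing the whole log, here is a minimal sketch that scans raw bytes for control characters XML 1.0 does not allow (SOH is 0x01). The sample log content, tag names and path are made up for illustration; the real Backup Exec job log layout is not shown in this thread. It only flags control bytes, not accented characters, which can be legitimate.

```python
import re

# Bytes below 0x20 are illegal in XML 1.0, except tab (0x09), LF (0x0A) and CR (0x0D).
ILLEGAL_XML_BYTES = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f]")

def find_corrupt_lines(data: bytes):
    """Return (line_number, offending_bytes) for each line holding illegal control bytes."""
    hits = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        bad = ILLEGAL_XML_BYTES.findall(line)
        if bad:
            hits.append((number, bad))
    return hits

# Hypothetical log fragment; the third line contains an SOH byte.
sample = (
    b"<log>\n"
    b"<file>C:\\data\\ok.txt</file>\n"
    b"<error>Directory not found \x01</error>\n"
    b"</log>\n"
)

for number, bad in find_corrupt_lines(sample):
    print(number, bad)
```

To run it on a real file, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes in.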

So I removed the surrounding lines from the job selection list and tried again that night. It failed again- this time the error line was further down but still in the brightmail list.
So I removed all brightmail selections and it backed up ok, although other files were still skipped because they were in use.
Next day it failed again with 'directory not found'. It is still like this.

Could this simply be caused by low disk space? The system partition on this Dell server is only 7GB and has only 298MB free. I wonder if the backup software needs a lot of disk space for temp files when it runs?
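
To put those figures in proportion, a quick sketch of the arithmetic (assuming 1 GB = 1024 MB; whether Backup Exec actually needs more than this for temp files is not established here):

```python
import shutil

def percent_free(total_mb: float, free_mb: float) -> float:
    """Free space as a percentage of the partition size."""
    return 100.0 * free_mb / total_mb

def percent_free_on(path: str) -> float:
    """Same figure for a live volume, using the standard library's disk_usage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total

# Figures from the post: 7 GB partition, 298 MB free.
print(round(percent_free(7 * 1024, 298), 1))
```

That is roughly 4% free on the system partition.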

Also, does anyone know if the brightmail rules really need backing up, or can I leave them out of the backup list?






Yorkshireman2
 
Update: the disk partition now has 1.2GB free space and backup worked a few times but now the backup failed again in the same way.
The error in the corrupted job log was again in the brightmail entries.
I can only try removing ALL brightmail entries from the selection list.

Any ideas what causes this?


Yorkshireman2
 
Hey there, I had a similar issue recently. I deleted and then recreated the job and the problem disappeared. On another site I edited the job selection to include everything and submitted the job, then edited it to exclude files again, and submitted and ran it.

Both instances resolved the issue

Let us know how you get on

Jamie
 
Hi Jamie,

I have created a new job as you suggest with all selected, then deselected some just as the original job was. Took ages to check each directory to see what was deselected. In most cases a slash (denoting partial selections) was there but EVERY file in the sub directories was ticked. Odd, eh?

Anyway, I'll try tonight.

Last night's error (with all brightmail deselected) was a different error- "Cancelled, timed out" because it exceeded the 13 hour preset window (it had backed up just over 75GB).
The last 'success with exceptions' backed up about 74GB and only just completed within 13 hours.
Is this taking too long for 75GB?

I think I'll reset the window for 15 hours to see what happens tonight.

Chris


Yorkshireman2
 
Jamie,

The 15 hours prevented it timing out BUT now it's back to the 'directory not found' error and once again a corrupt line in the job log in the middle of.... brightmail files! I thought I had deselected those but I must have forgotten while checking every other file.
So I am trying again tonight with all brightmail files deselected.

The last time this job was successful (with exceptions) it took just under 13 hours, so the 15 hours I set last night probably prevented the timeout error from masking this one.


I still wonder if 12/13 hours is too long for 75 GB of data to back up (I think that was about 141,000 files).
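
As a rough check on that question, the average rate implied by these numbers can be worked out directly. This is only a sketch of the arithmetic, assuming 1 GB = 1024 MB and that the job ran for the full 13 hours:

```python
def average_rates(gigabytes: float, hours: float, files: int):
    """Return (MB per second, files per second) averaged over the whole job."""
    seconds = hours * 3600
    return gigabytes * 1024 / seconds, files / seconds

mb_per_s, files_per_s = average_rates(75, 13, 141_000)
print(f"{mb_per_s:.2f} MB/s, {files_per_s:.1f} files/s")
```

That comes to about 1.6 MB/s and about 3 files per second on average.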

It's a Quantum DLT tape drive running from a SCSI adapter in a PCI slot.

I also checked the media eject-alert cancelling configuration- I read somewhere that if the tape sits in the drive ejected but not removed in time then it can cancel the job with a timeout error. It is now set to automatically respond with yes/ok after 5 minutes.


Chris



Yorkshireman2
 
Hey Chris sorry for the delay getting back to you,

The speed thing: if it's a DLT I would expect better throughput than that. Have you checked your drives for fragmentation, excluded the backup exec folders from your antivirus etc.? If it's badly fragmented it takes a while to back up, so I'd check that; it's a commonly overlooked issue.

As for the corrupt folders, hopefully excluding the live files will work for you. Let us know how you get on though, and good luck.

Jamie
 
Jamie,

Thanks for your input. I defragmented the system partition just recently but your note about "..excluded the backup exec folders from your antivirus etc." caught my attention. I didn't set up the backup jobs originally and I am still new to it, so I would like to check this. Can you explain what you mean?
Maybe something like that has been missed.

Meanwhile, last night the DLT failed again AND now the DAT backup failed too (it's been reliable up to now).
The DAT backup timed out. I wonder if it's because I extended the DLT job completion window by 2 hrs.
The DLT is scheduled to start at 7pm while the DAT backup is scheduled to start at 7am the next morning.
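
A small sketch of how far the DLT window reaches past the DAT start time under each setting. The calendar dates are arbitrary placeholders; only the clock times come from the post. An overlap by itself does not prove the DLT job caused the DAT timeout, since the two jobs use different drives.

```python
from datetime import datetime, timedelta

def window_overlap(start_a: datetime, hours_a: float, start_b: datetime) -> timedelta:
    """How long job A's completion window extends past the start of job B."""
    end_a = start_a + timedelta(hours=hours_a)
    return max(end_a - start_b, timedelta(0))

# Placeholder dates; 7pm DLT start, 7am DAT start the next morning.
dlt_start = datetime(2000, 1, 1, 19, 0)
dat_start = datetime(2000, 1, 2, 7, 0)

print(window_overlap(dlt_start, 13, dat_start))  # original 13-hour window
print(window_overlap(dlt_start, 15, dat_start))  # extended 15-hour window
```

With the 13-hour window the DLT job could already run one hour past the DAT start; with 15 hours it can run three hours past it.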



Chris



Yorkshireman2
 
Hey there,

Can you check your logs and let us know the time it's taking per partition and service (e.g. Exchange information store, SQL databases, remote servers)? All this is available in the logs. Or if you can save the logs and let us see them, that would be a good help to get you going.
 
Hi Jamie,

Not much point now- I went in today (Long weekend) to try replacing a drive in the Raid array. It seemed like a good idea while nobody was there over the weekend.
The server will not boot up now; it seems I have brought the whole business down. We now have no email, no database --nothing! I don't know what the boss will say when I get in on Tuesday. No customers can reach us now and no sales personnel can reach the sales data.

I noted all this in the other thread I have open in the DELL servers forum. A user called Technome was writing to me about it.




Yorkshireman2
 
The server is back up and panic over.
The boss has decided we can update our Backup Exec software, both to eliminate any problems due to the state of the software and to get tech support for it from Symantec. Not a bad move probably.

The only thing is, I suspect the new software needs more disk space than the old one (new software usually does) so I am proceeding to replace the remaining two small disks in our Raid array, let them rebuild, and then increase the system partition to give loads of room for updates, temp files etc.
THEN, install the new backup exec.

I'll let you know how it goes.

Yorkshireman2
 
Well, a couple of weeks ago I tried replacing the first of those disks and rebuilt the new one. It never rebooted.
It could not find the bootable drive. The server is now dead.
When someone else tried rebuilding the server and then tried to restore the last successful back up tapes, it wouldn't restore, apparently. So everything is in a shambles. Still trying to get our second server running as primary domain controller. You can see the story at:



Thanks again for help.

Yorkshireman2
 
News on that server-

After that 'third party' couldn't restore it, the server sat there for a week while we tried to get the other server to run as the primary domain controller and cure its backup troubles (it was failing because it was looking for its alternate IDR backup path on the dead server).

Then I began looking at the dead server. After some discussion with Symantec's Backup Exec support it became clear that the third party had not done it in the correct sequence. Apparently the best way is to rebuild the server as a workgroup, then restore the C drive and system state, then reboot to get it back on the domain.

So I rebuilt the server from scratch, installed Windows 2000 SBS, and installed updates to where I thought it was before it failed. Then I installed the new Backup Exec 12.5 that we had ordered just before it failed.
I inventoried the tape but it wouldn't catalog; it kept failing with I/O errors.
I examined the SCSI cable from server to tape drive and noticed a kink in the cable near the plug. Hmmmm. I wiggled it and tried again - this time the errors were different. Hmmmm. Bad cable!

I tried to get a replacement cable on the Friday but no-one had one; they kept saying these SCSI cables were obsolete now and no-one used them. Next Monday a friend at a college found one and I tried it. Successful catalog!

So I then restored the C drive and system state and rebooted. Now it's on the domain again.

Now I just have to figure out how to re-install exchange (and find which of the Windows 2000 SBS CDs it's on) and then try to restore the exchange database and I hope I can somehow extract the users' mail box data and folders.

The hope is to recover and use that data to migrate to the other server where we are trying to install exchange 2007.





Yorkshireman2
 