I have just taken over an environment of 125 Win2000 servers running BE9. The backups are not currently being monitored at all. I need a good policy and setup to make sure the backups are working every night.
Is there a good way to monitor that number of servers centrally?
Options I know of:
1) Email notifications. This would sort of work. Problems I see: a) tracking whether a server failed to send a notification would mean checking every server's email off a list each day to make sure they all reported; b) I can't rely on the summary success/fail status because it only shows whether the processes started, so I would have to read each entire job report to look for skipped files and verify problems, and that is a lot of email attachments to read; c) the guys who run our email servers are prone to taking down the internal SMTP server from time to time, so I would have to run my own SMTP server for 100% reliability (which for various reasons I don't want to do).
2) Rely on local staff to check and report problems. The drawback is that it is hard or impossible to know whether the checking is actually happening.
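To illustrate problem 1a, the daily check-off amounts to something like the sketch below. The server names and the way the reported names are collected are hypothetical; nothing here comes from BE9 itself, and it only shows the comparison that would have to be done by hand each morning.

```python
def missing_reports(expected_servers, reported_servers):
    """Return the expected servers that sent no notification, sorted by name.

    Names are compared case-insensitively with surrounding whitespace ignored.
    """
    reported = {name.strip().lower() for name in reported_servers}
    return sorted(
        server for server in expected_servers
        if server.strip().lower() not in reported
    )


# Hypothetical data: the full server list and the names seen in today's emails.
expected = ["SRV001", "SRV002", "SRV003"]
reported = ["srv001", "SRV003 "]

print(missing_reports(expected, reported))  # ['SRV002']
```

Even with this part scripted, problems 1b and 1c would remain.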
More info:
The servers are geographically dispersed, local staff switch the tapes, each server has its own BE9 installation and tape drive, and all sites are connected to central by DSL with 128 upload.
My goals:
Get reporting in place to make sure that: a) jobs are running nightly; b) tape drives are functional; c) all significant errors get followed up on.