Sanity checklist for Solaris servers

Status
Not open for further replies.

ssaachi

Technical User
Sep 24, 2001
42
US
Does anyone have a sanity checklist: basic things to check to make sure all is running well on a server?
Something along the lines of daily, weekly, and monthly checks.
Thanks
 
Hmm... here's a start:

[ul][li]Disk space - df -k[/li]
[li]Error messages, hardware failures - view /var/adm/messages. If you do this less often than weekly, also look at messages.0, messages.1, etc. for the previous weeks.[/li]
[li]CPU usage - sar - enable this by uncommenting the jobs in the sys user's crontab. Statistics are kept in /var/adm/sa for two weeks by default.[/li]
[li]Volume manager status - vxprint, metastat, or whatever is appropriate for the software you are using, if any.[/li]
[li]Uptime - uptime - just to see if there has been a reboot you didn't know about![/li][/ul]
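As a rough illustration of automating the disk space check, here is a minimal sketch. The function name check_df and the 90% default threshold are made up for the example, and it assumes the usual df -k column layout where the capacity column is second to last and the mount point is last. On older Solaris releases you may need ksh or the /usr/xpg4/bin tools rather than the default Bourne shell.

```shell
#!/bin/sh
# check_df (hypothetical name): reads `df -k` output on stdin and prints
# every filesystem whose capacity is at or above the given percentage.
# The 90% default is an arbitrary example value, not a recommendation.
check_df() {
    limit=${1:-90}
    # Skip the header line. Requiring at least 5 fields also skips the
    # device-only line that df prints when a long device name wraps.
    # "95%" + 0 evaluates to 95 in awk, so no string trimming is needed.
    awk 'NR > 1 && NF >= 5 && $(NF-1) + 0 >= '"$limit"' { print $NF " is at " $(NF-1) }'
}

# Usage on a live system:
#   df -k | check_df 90
```

Run from cron, any output could be mailed to whoever looks after the box; no output means nothing crossed the threshold.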

I'm sure there will be more. The frequency of each check depends on how busy or critical the system is. It's probably best to automate them with a script and review the output daily.

Annihilannic
 
If you have more than one server to check, it is useful to set up one of them as the log server, so that all the syslog error messages go to a central server which you check regularly.

As well as the checks Annihilannic has suggested, depending on how security-conscious you are, you might want to create and check /var/adm/loginlog (see the man page on loginlog) and also /var/adm/sulog (grep for ' - ' to find failed su attempts).
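The sulog check could be sketched roughly like this. The function name failed_su is made up for the example, and it assumes the usual sulog layout where each line starts with SU, followed by date, time, a + or - flag, the tty, and the from-to user pair; check a few lines of your own file before relying on it.

```shell
#!/bin/sh
# failed_su (hypothetical name): prints date, time and user pair for
# every failed su attempt recorded in the given sulog file.
# Assumed line layout: SU <date> <time> <+|-> <tty> <from>-<to>
failed_su() {
    awk '$1 == "SU" && $4 == "-" { print $2, $3, $6 }' "${1:-/var/adm/sulog}"
}

# Usage on a live system (needs read access to the file):
#   failed_su /var/adm/sulog
```

Matching on the fourth field rather than grepping for ' - ' avoids false hits if a dash with spaces ever turns up elsewhere on a line.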

Depending on the applications running (web server, database, etc.), you will probably want to check that their processes are running, and run a simple web page GET test or some SQL to confirm that your web server or database is still responding. These checks would need to run frequently to pre-empt any loss of service. You also need to check each day that any scheduled backups have worked, and on a weekly basis tidy up any old core dumps lying around (coreadm on Solaris 8 makes this much easier).
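For the "is the process still there" part, something along these lines might do as a starting point. The function name missing_procs and the daemon names in the usage line are only examples, and it assumes a POSIX shell and grep (on older Solaris that probably means ksh and the /usr/xpg4/bin tools) plus a process list with one command name or path per line.

```shell
#!/bin/sh
# missing_procs (hypothetical name): reads a process list on stdin, one
# command name or path per line, and reports each name given as an
# argument that does not appear in that list.
missing_procs() {
    # Reduce each entry to its basename so /usr/sbin/sshd matches sshd.
    list=$(sed 's|.*/||')
    for name in "$@"; do
        # -x matches whole lines only, so "http" will not match "httpd".
        printf '%s\n' "$list" | grep -x "$name" >/dev/null ||
            echo "$name is not running"
    done
}

# Usage on a live system (sshd and httpd are just example names):
#   ps -e -o comm= | missing_procs sshd httpd
```

This only proves the process exists, not that it is answering requests, so it complements the GET or SQL test rather than replacing it.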

You could end up putting a lot of work into a monitoring system and find out you're re-inventing the wheel. There are a lot of products, both commercial and shareware, that do this type of thing. BMC Patrol (commercial) and Big Brother (free for non-commercial use) are but two. These can be tailored and provide multi-platform support with lots of bells and whistles such as web-based monitoring and pager, helpdesk and NMS integration.

JB
 