Random System hanging

Status
Not open for further replies.

chikn

IS-IT--Management
Aug 20, 2001
62
0
0
US
This really isn't random, but kind of. My SCO 5.0.4 server stops responding at least twice a day: in the morning when first-shift operations start, and in the evening a couple of hours after second-shift operations start.

When you attempt to log in while this is happening, you get the login prompt and can enter your password, then nothing; it just sits there. If you already have a terminal up when this happens you can still move around, however a lot of programs, like our main COBOL programs, will not run. The last entries in the syslog when this happens are "unable to find user in protected password database" (or something to that effect). It does this for every attempt you make to log in to the system when it freaks.

There is no specific application it stops on. This just started 5 days ago, and no system changes were made. The only thing I find odd is that when I verify the system from custom, it seems to find the same discrepancies with passwd, group and the vision mapfile. And when I view my group file in /etc, only around 20 users are listed under group 50, when I have over 380 users in the passwd list. Why aren't all my group 50 users showing in the group file under group 50? It might not be related to the system not responding, but I don't understand why they aren't there.
 
Has anything changed in the past few weeks? For example: added more memory, a new board, installed new or upgraded software, etc.?

Did you have an uncontrolled powerdown (UPS ran out overnight due to a power failure, etc.)?

Have you run an fsck -ofull on all the disk drives?

Could be a developing hardware problem with memory, hard drives or even the internal power supply.

Shift startups/changes suddenly stress all of the above, pushing borderline hardware over the edge.

Have you checked the system logs for possible complaints from one of the above components? (/usr/adm/messages and/or syslog)

Pat Welch, UBB Computer Services
Caldera Authorized Partner
Unix/Linux/Windows/Hardware Sales/Support
(209) 745-1401 Fax: (413) 714-2833
 
About your user and group question: The /etc/group file is the database which tracks group membership. If it is showing only 20 of your 380 users in group 50, that really does mean that only 20 of your users have membership in that group. You are confusing group membership with the user's default group setting from the /etc/passwd file. The default group setting is the group ID assigned to any file that user creates on the system. A user is not automatically made a member of their default group when you create the user ID; rather, you must add membership when you create the user, or the two files will not correlate.

To add group membership, use the scoadmin "Account Manager", view by groups, then modify the group adding the users you want.
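
To see which users fall into that gap, the two files can be compared directly. The following is only a minimal sketch, assuming the usual colon-separated layouts (name:password:uid:gid:... in passwd, name:password:gid:members in group) and assuming Python is available on some machine where you can work on copies of the files; the function name is made up for illustration.

```python
def primary_group_without_membership(passwd_text, group_text, gid):
    """Return users whose default group in passwd is `gid` but who are
    not listed as members of that group in the group file."""
    gid = str(gid)

    # Collect the member list of every group entry carrying this GID.
    members = set()
    for line in group_text.splitlines():
        fields = line.strip().split(":")
        if len(fields) >= 4 and fields[2] == gid:
            members.update(name for name in fields[3].split(",") if name)

    # Walk passwd and keep users with this default GID who are not members.
    missing = []
    for line in passwd_text.splitlines():
        fields = line.strip().split(":")
        if len(fields) >= 4 and fields[3] == gid and fields[0] not in members:
            missing.append(fields[0])
    return missing
```

You would call it with the contents of copies of /etc/passwd and /etc/group and the number 50; the names it returns are the ones to add through the Account Manager.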

Your passwd and group file discrepancies probably aren't related to your problem; however, the number of users concurrently connected may be. You say this started during the past 5 days. Did this possibly correlate with an increase in user connections? It would not necessarily require a large increase if your system was already close to maxing out its resources. 380 users is a large number of connections, which require a significant amount of resources. How many pseudo-ttys do you have configured? You should probably try increasing the number of pseudo-ttys. Your system may also need more memory to handle all of the users.
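
One rough way to check the connection count is to save the output of who during a shift start and count the sessions. This is just a sketch under assumptions: it expects the usual "name line ..." columns from who, and it assumes your pseudo-tty lines start with "ttyp", which may differ on your system. The function name is hypothetical.

```python
def count_sessions(who_output, pty_prefix="ttyp"):
    """Return (total sessions, sessions on pseudo-ttys, distinct users)
    from saved `who` output. `pty_prefix` is an assumed naming scheme."""
    total = 0
    on_pty = 0
    users = set()
    for line in who_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        total += 1
        users.add(fields[0])
        if fields[1].startswith(pty_prefix):
            on_pty += 1
    return total, on_pty, len(users)
```

If the pseudo-tty session count at the time of a hang sits at or near the number of pseudo-ttys you have configured, that would support raising the limit.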
 
Thanks for your replies-

PatW- I don't believe it's hardware related, and no changes have been made to the system at all. I have noticed that when the system stops responding and I do an uptime on the console (if there was a console left open), my usual load average of 7 to 12 is now well above 250. WOW!! Is it possible to discover the process that is causing the load?
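
One possible approach to that last question, offered as a sketch rather than a definitive answer: capture ps -ef to a file from the open console during a hang, then tally the processes by command. A load average that high usually means a great many processes are queued up, so the command with the most copies or the most accumulated CPU time is a reasonable first suspect. The code assumes the System V column layout (UID PID PPID C STIME TTY TIME CMD) and that STIME is either a clock time or a two-word date such as "Aug 20"; the function names are made up for illustration.

```python
from collections import Counter


def parse_cpu_time(text):
    """Convert a ps TIME value such as '12:34' or '1:02:03' to seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def summarize_ps(ps_output):
    """Return (command, process count, total CPU seconds) tuples,
    busiest command first, from saved `ps -ef` output."""
    counts = Counter()
    cpu = Counter()
    lines = ps_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 8:
            continue
        # Assumption: an alphabetic STIME means a two-token date ("Aug 20"),
        # which shifts the remaining columns right by one.
        shift = 1 if fields[4].isalpha() else 0
        time_idx, cmd_idx = 6 + shift, 7 + shift
        if len(fields) <= cmd_idx:
            continue
        try:
            used = parse_cpu_time(fields[time_idx])
        except ValueError:
            continue  # unexpected TIME format, skip the line
        command = fields[cmd_idx]
        counts[command] += 1
        cpu[command] += used
    return sorted(
        ((name, counts[name], cpu[name]) for name in counts),
        key=lambda row: (-row[1], -row[2]),
    )
```

Comparing the summary from a hang against one taken while the system is healthy should show which command is piling up.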
 