
Mysterious "No space on dev hd (1/42)" ... What's the deal?


josel (Programmer)
Oct 16, 2001
Howdy!

Using SCO 5.0.6a running on a Compaq Proliant 7000 with eighteen 18.2 GB HDs on an RS3200 controller in RAID 5, and 4 CPUs.


Late Friday night (6/11), my server started to show this error.

WARNING: err: Error log overflow
CPU3: NOTICE: HTFS: No space on dev hd (1/42)
CPU4: NOTICE: HTFS: No space on dev hd (1/42)

By the time I was able to come to the office, the error itself was only visible in /usr/adm/messages but no longer scrolling on my console.

Running df -v shows:

Mount Dir  Filesystem   blocks     used       free       %used
/          /dev/root    7168000    1539342    5628658    22%
/stand     /dev/boot    30720      30720      0          100%
/data      /dev/data    351895308  210797414  141097894  60%
/fpdev     /dev/fpdev   24576000   938524     23637476   4%
/appl      /dev/appl    40960000   12713020   28246980   32%

Just a couple of minutes ago, I saw the same messages scrolling up the screen and I jumped off my chair. I ran df -v and it showed that / was 100% used.

I do not know of any process that
1) writes to /
2) could possibly use close to 3 GB of disk space

Even if I dumped one of my largest files, it would not be this big.

I feel as though I have something to be concerned about and would not like it to catch me unprepared. Could you guys point me in a direction where I might find the root cause?

The error log overflow message gets to me. What does it mean? Which error log is it? Could it be something as simple as a tunable variable?

Thank you all in advance for your assistance!


Jose Lerebours

KNOWLEDGE: Something you can give away endlessly and gain more of it in the process! - Jose Lerebours
 
Go to /var/adm and look for big files in a subdirectory named cpqmon or something like that (not sure because I'm not on a Compaq box right now).

ex:
# l -R | sort -k5 -n

empty it using:
# > hugefilename

I think Compaq's system monitors for 5.0.6 used to have a bug and didn't empty some log file.
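Putting those two steps together, here is a minimal sketch. It uses a temporary directory as a stand-in for the real log location (the cpqmon path above is uncertain, so substitute whatever you find on the box):

```shell
# Stand-in for the Compaq agents' log directory -- adjust for the real box
LOGDIR=$(mktemp -d)
dd if=/dev/zero of="$LOGDIR/cpqmon.log" bs=1024 count=2048 2>/dev/null

# List files bigger than 2048 512-byte blocks (~1 MB), biggest last
find "$LOGDIR" -type f -size +2048 -exec ls -l {} \; | sort -k5 -n

# Empty the file in place; a daemon writing to it keeps its open descriptor
: > "$LOGDIR/cpqmon.log"
ls -l "$LOGDIR/cpqmon.log"
```

Truncating with `>` rather than removing the file matters: if you rm a file a running daemon still has open, the space is not freed until the daemon exits.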

Theophilos.

-----------

There are only 10 kinds of people: Those who understand binary and those who don't.
 
That's the first place I went, and the files messages and syslog were pretty small.

I am afraid I now have the daunting task of checking the time it occurred, matching that against all crons that may have been active, and seeing what they do and how this could have happened.

They do not pay me enough for this :-(

Regards;


Jose Lerebours

 
As it has been a few days since your post, if you still have not solved this issue...

1. Try searching for large files:
find /stand -size +10000

(find's -size is in 512-byte blocks by default, so +10000 is roughly 5 MB.) You will probably find something extremely large, or you may in fact have a ton of smaller files, so reduce the threshold if you need to.

2. Check farther back in your /usr/adm/messages and /usr/adm/syslog in case anything is left that shows an earlier, related error.

3. From your 'df -v' output, only /stand is full. Those 30,720 blocks work out to about 15 MB, not 3 GB; on my systems /stand is a similar size and uses only 20194 blocks. Check lost+found on /stand, which could have something in it... since the filesystem is relatively small, even something insignificant written there could fill it.

4. Did you recompile your kernel lately? Check for any errors, like a core file that might have been dumped there.

I'm still thinking of other things you might check, but there should generally not be very much in /stand, so you should be able to go through the directory manually reasonably quickly. If you can't, that's a first sign something is out of whack.
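For step 2, a grep sketch along these lines would pull out earlier occurrences of the two errors quoted in the original post (the path is the SCO default; the MSG override is just so it can be pointed anywhere):

```shell
# Search a messages file for the errors from the original post, with line
# numbers, showing only the most recent 20 hits.
MSG=${MSG:-/usr/adm/messages}
grep -n -e 'Error log overflow' -e 'No space on dev' "$MSG" | tail -20
```

Line numbers from grep -n make it easy to open the file at the right spot and read the surrounding entries for an earlier, related failure.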

good luck!

 
Well, I think I did solve it ... it turned out to be a report printed to /tmp/[somefilename.txt] which exceeded 2.0 GB, and I have barely over 3.0 GB free on /.

When I found the file (thank God it fit), I was able to identify the source and vented at the operator for printing such a large report to a file (or for printing it at all).

I caught this by running df -v and mailing the output to myself, which let me spot the disk usage ... it was at 88%, up from 22% the day before. I then went hunting for large files in the usual places (as per our application) and there it was.
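That kind of daily check can also flag trouble automatically. A sketch, with the 85% threshold and the "df -k" form as assumptions (SCO's own syntax was df -v, and its %used column position differs, so the awk field may need adjusting):

```shell
#!/bin/sh
# Print a df report, marking any filesystem at or above THRESHOLD percent.
THRESHOLD=85
df -k | awk -v limit="$THRESHOLD" 'NR > 1 {
    pct = $5; sub(/%/, "", pct)                 # strip the % sign
    print $0 (pct + 0 >= limit ? "  <== over limit" : "")
}'
# A cron entry could then mail this report daily, e.g.:
# 0 7 * * *  /usr/local/bin/diskcheck.sh | mailx -s "disk usage" root
```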

To remedy this, I have a cron job that cleans all temporary directories of files that are likely just temporary repositories for reports and processes. I am also removing stored files 120+ days old ... the daily report files, of course.
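A cleanup job along those lines might look like the sketch below. The directory list is a placeholder for the application's actual report spools; the 120-day cutoff is from the post. It is worth running the find with -print first to review what would be deleted before adding the rm:

```shell
#!/bin/sh
# Remove report/temp files older than 120 days from the cleanup directories.
# DIRS is an assumption -- substitute the application's real directories.
DIRS="/tmp /usr/tmp"
for d in $DIRS; do
    # -mtime +120: last modified more than 120 days ago
    find "$d" -type f -mtime +120 -exec rm -f {} \;
done
```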

Thank you all for replying!


Jose Lerebours




 
man cleantmp

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
 