
Error=24 Too many open files


tcraig (Technical User), Oct 5, 2001
I have recently started getting "Error=24 Too many open files" messages sporadically. When this error occurs I am forced to reboot. I thought the maximum number of open files was controlled by the nofiles setting in /etc/security/limits, but I have both my hard and soft limits set to unlimited.

<xke>ulimit -Sa
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) 131072
stack(kbytes) 32768
memory(kbytes) 32768
coredump(blocks) 2097151
nofiles(descriptors) unlimited
<xke>
<xke>ulimit -Ha
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) unlimited
memory(kbytes) unlimited
coredump(blocks) unlimited
nofiles(descriptors) unlimited

Is there something else that I should be looking at? Any assistance would be greatly appreciated.
 
Where are you seeing this error message? That does not look like the standard AIX xxxx:yyyy error message.

BV
 
Users get the error message when trying to initiate programs that open files. I contacted Unidata, my database supplier, with the error first, and they insist that it is being generated by AIX. No errors are being recorded in either the database or AIX.
 
According to /usr/include/sys/errno.h:

#define EMFILE 24 /* Too many open files */

Indeed, this error is coming from AIX.
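
If you want to see which process is actually holding the descriptors when the error hits, a rough sketch like this may help (this assumes an AIX level whose /proc filesystem exposes a per-process fd directory; the 100 threshold is arbitrary):

# Flag any process with more than 100 open descriptors
for pid in $(ps -e -o pid=); do
    n=$(ls /proc/$pid/fd 2>/dev/null | wc -l)
    [ "$n" -gt 100 ] && echo "pid $pid: $n open files"
done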

I've seen conditions where setting the hard limit actually causes problems.

As well, if you change /etc/security/limits, the change needs to be made per user-id, and once changed, that user-id needs to be logged off (all applications/programs stopped) and then logged back in. Optionally you can reboot, but that is not necessary if you can recycle the application and the users logged in under that user-id.

If you change just 'root' and your application runs under a separate user-id, then changing root will have no impact. You need to change the stanza for the application's user-id, or the 'default' stanza.
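
For example, a sketch of raising the limits for one user-id ('appuser' here is hypothetical, and the nofiles_hard attribute may not exist on older AIX levels):

# Raise the soft and hard descriptor limits for the application's user-id
chuser nofiles=8192 nofiles_hard=8192 appuser
# Verify the stanza now recorded in /etc/security/limits
lsuser -a nofiles nofiles_hard appuser

Remember the user-id still has to log off and back on before the new values apply.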

BV
 
I already have this in /etc/security/limits:

default:
fsize = -1
core = -1
cpu = -1
data = -1
rss = -1
stack = -1
nofiles = -1

Do I need to change something other than this? If I log in to our application with a generic user name and
run !ulimit -Ha, I get the following:

:!ulimit -Ha
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) unlimited
memory(kbytes) unlimited
coredump(blocks) unlimited
nofiles(descriptors) unlimited


 
But again, per the last part of my previous post: was the application bounced and were the users logged off after /etc/security/limits was changed?

BV
 
Sorry, I forgot to put that detail in here. The limits have been set this way since our last upgrade, which was approximately one year ago. We do weekly shutdowns, so the programs have all been cycled regularly. Is this what you meant by "bounced"?
 
Ok, then. We bounce it once a week.
 
Just for grins, run this command:

lsattr -El sys0 -a maxuproc

Perhaps the program is running out of available processes.

I have mine set to a minimum of 500.
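
To see whether any single user-id is creeping up on that ceiling, a quick count like this may help (assuming standard ps -ef output, where column 1 is the user):

ps -ef | awk 'NR > 1 { c[$1]++ } END { for (u in c) print c[u], u }' | sort -rn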

BV
 
<xke>lsattr -El sys0 -a maxuproc
maxuproc 128 Maximum number of PROCESSES allowed per user True

What does PROCESSES control and how can I change it?
 
That sets the number of processes allowed per user-id. These are programs that show up in the "ps" command.

Perhaps there is a chance that the application is trying to 'fork' a new program and can't, so it sees it as a file limit. I don't know - it is a stretch.

chdev -l sys0 -a maxuproc=500

Is what I run on this particular host.
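
If I recall correctly, the change takes effect immediately (no reboot needed) and is stored in the ODM, so you can confirm it right away:

lsattr -El sys0 -a maxuproc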

BV
 
Thanks for your help. I'll try this and see if it makes a difference.

TAC
 
This same issue bit me a few times too. Talk with your users and see if 500 is a good limit. I've got an InterSystems Caché database that has forced me to set this value to 4096.
 
Thanks for your assistance before. We had the "Too many open files" error again with maxuproc=500. While the error was still occurring I set maxuproc=4096. The error went away and users were able to function.

My problem with this is that there were no entries in ps -ef to justify hitting the 500 limit. The total number of processes system-wide was only 370 (per ps -ef). There was nothing in the errpt either. Memory, paging, and disk space all seemed to be fine.

Am I looking in the wrong place for these processes? How can I determine what caused the system to hit the maxuproc?

Thanks in advance!

 
It is possible (likely) that there are processes spawning that you cannot see because they are short-lived.
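
One way to catch them is to sample process counts in a tight loop while the problem is happening; a sketch (the one-second interval and log path are arbitrary):

# Snapshot per-user process counts every second so short-lived forks show up
# Press Ctrl-C (or kill the loop) when done
while :; do
    date
    ps -ef | awk 'NR > 1 { c[$1]++ } END { for (u in c) print "  " c[u], u }'
    sleep 1
done >> /tmp/proc_counts.log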

Other than that, it could relate to memory pools that grow along with the increase in maxuproc.

I have no definitive answer.

But at least the problem is resolved. This parameter will remain constant across reboots, so you do not need to worry about setting it after a reboot. Be sure you document this setting somewhere in your procedures, should it come up again.

BV
 