IBM p570 -

Status
Not open for further replies.

teqmem

Programmer
Nov 26, 2004
114
US
Hello,

I have an IBM p570 with AIX 5.2. It runs Oracle 10g.

I have a problem whose cause I haven't been able to pinpoint.

The machine just stops responding on all ports (sqlnet, telnet, ssh, ftp, etc.). When I try to log in on the HMC console, I get this error and it won't let me proceed:
Code:
  Could not load prog -ksh
  Could not load /usr/lib/libc (shr.o)
    Dependent module libcrypt.a (shr.o) could not be loaded.
  Could not load module libcrypt.a (shr.o)
  System error:  Not enough space

However, I know that all the file systems have enough space.

I then have to use "Server Management" on the HMC console to force a restart. After the restart, without any changes, it works without a problem.

I have looked at the error log, but each time I found no corresponding entry (all entries were old).

Do you have any ideas on how I can isolate this problem?

Thank you.
 
Which ML of AIX are you on?
I'd start by looking at memory consumption. Perhaps there is a kernel memory leak, or Oracle user sessions that don't get closed, so memory slowly gets eaten. Be sure the maximum number of processes per user (oracle, root) is high enough, and that enough memory is available for your 10g.

rgds,

R.
 
I have the same environment as you, teqmem, but with 5.3!

I once had an error similar to yours, though with a different message, and it was caused by a memory shortage!

I believe you need a minimum of 3 GB of RAM installed.

How much RAM do you have?
 
It can also be down to ulimit. Try changing the limits in /etc/security/limits to -1 for the root and oracle users


or

ulimit -s unlimited
ulimit -d unlimited

Mike

"Whenever I dwell for any length of time on my own shortcomings, they gradually begin to seem mild, harmless, rather engaging little things, not at all like the staring defects in other people's characters."
 
Like the rest of the posters, I would check RAM and paging space.

topas
svmon -G
svmon -Put 5

All of these should be helpful.
 
I have enough RAM and paging space, as shown below:

Code:
/root> svmon -G
               size      inuse       free        pin    virtual
memory      2007040    1632231     374809     153158     689595
pg space     524288      45264

               work       pers       clnt      lpage
pin          153158          0          0          0
in use       650899       5579     975753          0

/root> bootinfo -r
8028160

My OS is 5200-05.

I'll look into the other suggestions. Thanks.
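
For what it's worth, the svmon -G figures above can be turned into more familiar units. This is only a minimal sketch: it assumes svmon is counting 4 KB pages, the numbers are typed in by hand from the post, and the names pages_to_mb and svmon_summary are made up for illustration.

```shell
#!/bin/sh
# Numbers copied by hand from the svmon -G output in this post.
# Assumption: svmon counts 4 KB pages.
PAGE_KB=4
mem_size=2007040
mem_free=374809
mem_work=650899
mem_clnt=975753
pg_size=524288
pg_inuse=45264

# Convert a page count to whole megabytes (integer division, rounds down).
pages_to_mb() {
  echo $(( $1 * $PAGE_KB / 1024 ))
}

svmon_summary() {
  echo "real memory KB: $(( $mem_size * $PAGE_KB ))"
  echo "free MB: $(pages_to_mb "$mem_free")"
  echo "working storage in use MB: $(pages_to_mb "$mem_work")"
  echo "client pages in use MB: $(pages_to_mb "$mem_clnt")"
  echo "paging space MB: $(pages_to_mb "$pg_size")"
  echo "paging space used %: $(( $pg_inuse * 100 / $pg_size ))"
}

svmon_summary
```

The first line comes out as 8028160 KB, which agrees with the bootinfo -r value, so the 4 KB page assumption looks right. Paging space works out to 2048 MB against roughly 7.7 GB of RAM, which is worth keeping in mind when reading the paging suggestions in this thread.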
 
You should look at this periodically.

Oracle, especially 10g, can eat up RAM in a hurry. You may have plenty now, but that doesn't mean you did when you were getting those errors.

Were there any fork errors at the time you were getting those errors?
 
I'd up the paging space to at least 20 GB.


HTH,

p5wizard
 
In response to mrn, the ulimits are already set to -1 for both root and oracle.
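
A quick way to confirm what a given login shell actually sees is to ask ulimit directly, since values in /etc/security/limits are picked up at login and processes that were already running keep the limits they started with. A minimal sketch, run as the user in question; the helper name check_limit is hypothetical:

```shell
#!/bin/sh
# Print the current soft limit for one ulimit flag letter (d = data, s = stack).
check_limit() {
  val=$(ulimit -"$1")
  if [ "$val" = "unlimited" ]; then
    printf '%s\n' "-$1: unlimited"
  else
    printf '%s\n' "-$1: $val (capped)"
  fi
}

check_limit d
check_limit s
```

If either line shows a number instead of unlimited, the -1 setting has not reached that session.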
 
I'm with hfax, I think something is taking all the memory.
If you have AIX support, raise a PerfPMR the next time you find the system unresponsive. When you restart it, tell it to take a dump and get the dump analysed.
Make sure you have enough dump space first.
 
What is the output from ipcs -ma? Add the SEGSZ values together; the total should not exceed your real memory, and you will be able to see where the memory segments are. Also check the MODE column to see whether any segments are marked for deletion and still taking memory.
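
Adding up SEGSZ by hand gets tedious, so here is a rough sketch that does it with awk. It makes several assumptions: the ipcs -ma header line contains the column names SEGSZ and MODE, shared memory rows start with "m", no column is empty (so whitespace splitting lines up with the header), and a leading "D" in the mode field marks a segment flagged for deletion. The function name sum_segsz is made up for this example.

```shell
#!/bin/sh
# Read ipcs -ma style output on stdin and total the SEGSZ column.
sum_segsz() {
  awk '
    # Header line: remember which columns hold SEGSZ and MODE.
    !col && /SEGSZ/ {
      for (i = 1; i <= NF; i++) {
        if ($i == "SEGSZ") col = i
        if ($i == "MODE")  mode = i
      }
      next
    }
    # Shared memory rows: add up sizes, and track those flagged with D.
    col && $1 == "m" {
      total += $col; n++
      if (mode && substr($mode, 1, 1) == "D") { dtotal += $col; d++ }
    }
    END {
      printf "segments=%d bytes=%.0f marked_for_deletion=%d deleted_bytes=%.0f\n", n, total, d, dtotal
    }
  '
}
```

Typical use would be `ipcs -ma | sum_segsz`. Note that the total is in bytes, while bootinfo -r reports KB, so divide by 1024 before comparing the two.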
 
Just to check on the paging, I ran a constant monitor of the paging space for several days and found that usage didn't go above 28%. But since the problem didn't occur during this period, I would classify my finding as inconclusive.
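
Since the box becomes unreachable when the problem hits, one option is to leave a logger running so the last few samples are still on disk after the forced restart. A minimal sketch, assuming lsps -s and svmon -G are the figures worth keeping; the function names and the log path in the usage note are placeholders, not anything from the original setup.

```shell
#!/bin/sh
# Run a command only if it exists on this host; otherwise note that it is missing.
run_if_present() {
  if type "$1" >/dev/null 2>&1; then
    "$@" 2>&1
  else
    echo "($1 not found)"
  fi
}

# One timestamped snapshot of paging space and memory usage.
snapshot() {
  echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
  run_if_present lsps -s
  run_if_present svmon -G
}

# watch_mem SAMPLES SECONDS LOGFILE
# Append SAMPLES snapshots to LOGFILE, sleeping SECONDS between them.
watch_mem() {
  i=0
  while [ "$i" -lt "$1" ]; do
    snapshot >> "$3"
    i=$((i + 1))
    if [ "$i" -lt "$1" ]; then
      sleep "$2"
    fi
  done
}
```

For example, `watch_mem 288 300 /var/tmp/memwatch.log &` would take a sample every five minutes for a day. After the next hang, the tail of the log shows what memory and paging looked like just before the system stopped responding.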
 