AIX Bug?

Status
Not open for further replies.

KOG

MIS
Jan 31, 2002
303
GB
Hi Folks,

I have had situations where the Oracle listener hangs intermittently on the server (AIX 4.3.3), and the only solution is to abort the database and reboot the box. No ORA- error messages are produced, which is rather odd. The question here is: is there a bug in AIX 4.3.3?

Has anyone else experienced this and, if so, any idea why?

We have approximately 80 concurrent users logged on daily (that is the maximum), coming from all sorts of front-end applications such as web-based applications, applications via SQL Server, Access, Excel and so on.

The processes parameter in the Oracle parameter file is 300, the max number of processes per user on AIX is 256, and the max number of maxuproc requests is 16384.

It is difficult to pinpoint the cause of the listener problem, and I am struggling to work out how the process limits in Oracle and on the AIX server should relate to each other.

Any feedback would be greatly appreciated.

Thanking you all in advance.

Regards

Katherine
 
Hi,

If the listener hangs, why do you need to stop the database and reboot the server instead of simply restarting the listener?

Do you get any error message in errpt (something like "SOFTWARE PROGRAM ABNORMALLY TERMINATED") or a core dump in your filesystem?

When you add up all processes owned by oracle ("ps -fu oracle_dba |wc -l", i.e. db processes + clients + aioservers + whatever you get if the DBA(s) are connected with CDE + ...), can you exceed 256?
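
A minimal sketch of that check, assuming the Oracle owner account is oracle_dba as in the command above. Note that "ps -fu" prints a header line, so the raw "wc -l" figure is one higher than the real process count. The function name procs_headroom is made up for illustration; "lsattr -El sys0 -a maxuproc" is the usual way to read the per-user limit on AIX.

```shell
#!/bin/sh
# Hypothetical helper: report how close a user's process count is to maxuproc.
# $1 = raw line count from "ps -fu USER | wc -l" (includes the ps header line)
# $2 = maxuproc limit
procs_headroom() {
    lines=$1
    limit=$2
    count=$((lines - 1))            # drop the ps header line
    left=$((limit - count))
    if [ "$left" -le 0 ]; then
        echo "AT LIMIT: $count processes, maxuproc $limit"
    else
        echo "OK: $count processes, $left below maxuproc $limit"
    fi
}

# On the AIX box (not run here; oracle_dba is the account name used above):
#   lines=$(ps -fu oracle_dba | wc -l)
#   limit=$(lsattr -El sys0 -a maxuproc | awk '{print $2}')
#   procs_headroom $lines $limit
```

Running this from cron every few minutes and keeping the output would show whether the count creeps toward the limit before a hang.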
 
Hi

When the listener hangs, I cannot stop it or get a status response from it. That is why rebooting the box seems the only option.

Yes, I do get this error message, and I have just noticed a core file (should that be removed?). So is this a case of the AIX maxuproc being too low?

Regards

Katherine
 
Sometimes we have this kind of problem.

We just kill the listener process,
or kill the active sessions to the instance to find which one is blocking:
oraclexxxx (DESCRIPTION=(LOCAL=no)(ADDRESS=(PROTOCOL=BEQ)))

In this case tnsping gets no response.

Afterwards we run lsnrctl start listener_name.
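
A rough sketch of the kill-and-restart sequence described above. The helper name listener_pid and the listener name LISTENER are assumptions for illustration; the helper only picks the PID out of "ps -ef" output, and the actual kill and restart are left as comments.

```shell
#!/bin/sh
# Hypothetical helper: read "ps -ef" output on stdin and print the PID of the
# tnslsnr process serving the listener named in $1.
listener_pid() {
    awk -v name="$1" '
        {
            # look for a command field ending in "tnslsnr" followed by the listener name
            for (i = 1; i < NF; i++)
                if ($i ~ /tnslsnr$/ && $(i + 1) == name) print $2
        }'
}

# On the server (not run here; LISTENER is an assumed listener name):
#   pid=$(ps -ef | listener_pid LISTENER)
#   kill "$pid"              # escalate to kill -9 only if the process ignores this
#   lsnrctl start LISTENER
```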

regards.
Pascal.
 
Have you checked to make sure you're not running out of some other resource? Paging comes to mind: when free paging space gets extremely low, you start seeing symptoms of unresponsiveness.
 
hi KOG,

We found the same kind of problem too; we shut down the database and restarted it.

Any developments on your end?

Thanks in advance.
aixnag aixnag
IBM Certified Specialist - P-series AIX 5L Administration
IBM Certified Specialist - AIX V4 HACMP
 
Hi

Paging space is now about 45% used. I increased the maxuproc and aio settings a few months ago and all seemed fine until last week, when it hung again. Now I am getting worried and I do not know what to do at this stage. Should I increase the number of processes (it is 256 now)? It is hard to know what values to use for the settings (I am still new to AIX).

What do you normally do when it happens? I do not like the idea of aborting the database and rebooting the box again and again.

Regards

Katherine
 
Hi folks,

What happens if I increase maxuproc to a high value? I just want to know the side effects: at the moment it is 256 and I plan to increase it to 512. Is that wise or too much? The Oracle processes value is currently 300.

It is so hard to work out the optimal value for maxuproc.

Any suggestions?

Cheers

Katherine
 
I would characterize 45% paging utilization as high. Add more memory, or reduce the memory load (aio, maxuproc, decrease Oracle's SGA, etc).

When paging utilization starts getting high, AIX starts getting very tight with resources, including barring new processes from starting, memory from being allocated, and things like that. Increasing maxuproc in a situation like that won't help since AIX is preventing new process creation for other reasons.
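
As a small sketch of keeping an eye on this, the helpers below pull the "Percent Used" figure out of "lsps -s" output and compare it with a threshold. The function names and the threshold of 40 are assumptions chosen for illustration, not recommended values.

```shell
#!/bin/sh
# Hypothetical helper: read "lsps -s" output on stdin and print the
# "Percent Used" figure as a bare number.
paging_pct() {
    awk 'NR == 2 { sub(/%/, "", $2); print $2 }'
}

# Hypothetical helper: warn when usage reaches a chosen threshold.
# $1 = percent used, $2 = threshold (an assumption, pick your own)
paging_check() {
    if [ "$1" -ge "$2" ]; then
        echo "WARN: paging space ${1}% used"
    else
        echo "ok: paging space ${1}% used"
    fi
}

# On AIX (not run here):
#   paging_check "$(lsps -s | paging_pct)" 40
```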
 
Hi Chapter11

How do I add more memory to paging? Does that mean adding more paging spaces through smitty? Should I add it into root vg or oracledata vg?

Page Space  Physical Volume  Volume Group  Size   %Used  Active  Auto  Type
hd6         hdisk0           rootvg        912MB     44  yes     yes   lv

As this is something I have never done before, should that be done outside working hours?

Is it ideal to add another paging space the same size as hd6?

Sorry for all these questions, but I need to get all the information before I start creating another paging space.

Many thanks for answering my questions.

Much appreciated.

Regards

Katherine
 
Paging space is most easily added through smitty. You have the choice of extending an existing paging space, or creating a new one. In either case, it is measured in PPs, not megs.
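
Since the size is given in partitions, here is a minimal sketch of the conversion from megabytes. The helper name pps_for_mb and the 16 MB PP size are assumptions; check the real PP size of the volume group with lsvg first. The mkps and chps lines are the command-line equivalents of the smitty screens and are shown as comments only.

```shell
#!/bin/sh
# Hypothetical helper: number of partitions needed to hold $1 MB
# when the volume group's PP size is $2 MB (rounded up).
pps_for_mb() {
    echo $(( ($1 + $2 - 1) / $2 ))      # integer ceiling division
}

# On AIX (not run here; the 16 MB PP size is an assumed value):
#   lsvg rootvg | grep "PP SIZE"                    # find the real PP size
#   mkps -a -n -s "$(pps_for_mb 912 16)" rootvg     # new paging space, active now and at restart
#   chps -s 8 hd6                                   # or: extend hd6 by 8 partitions
```

With those assumed figures, a second paging space matching the existing 912MB hd6 would need 57 partitions.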

I personally hold the opinion that all paging space should be in rootvg. If a paging space is in another volume group, and that volume group goes offline or has other trouble that results in the paging space becoming physically unavailable, the system crash-halts immediately. It may cause problems during boot as well if the paging space is expected to be available, but is not, but I am not certain as I've kept myself from ever getting into that situation.

Paging space can be added live and hot, but it cannot be similarly decreased. You can only reduce paging space by rebooting.

The rule of thumb on sizing multiple paging spaces that I was given in some IBM seminar or another is that it is best if all paging spaces are of the same size. The algorithm that handles paging space allocation accounts only for the number of paging spaces, not their sizes as compared to each other. The end result is somewhat inefficient searching if paging spaces are of dissimilar sizes.

Adding additional paging space should only be considered a stopgap measure. The long-term fix is to either increase physical memory, or reduce demand on memory.
 
Hi

Checking our system configuration, I have just realised there is only one 9 GB disk for rootvg and 5 x 9 GB disks for oracledata. Is it OK to put one paging space on the least active disk within the oracledata vg, or should it be within rootvg?

Regards

Katherine
 
Hi folks

Is there a simple command to check how much real memory is being used?

Regards

Katherine
 
Hi,

It has become quite interesting since my last visit. I must say I am very interested in your problem, as I am experiencing much the same thing (memory usage).

1) Check how much real memory is being used:
svmon -F (displays in frames)

2) Paging space
Oracle uses shared memory, which is pre-allocated in paging space even if the frames aren't used (a lot of people keep arguing about this, but it is the latest and most sensible answer I could get from IBM and Oracle). That's why we often hear that swap size should equal RAM size.

If you use "lsps -s", you get an idea of how much paging space is allocated or used, but nothing about paging activity. Use vmstat for that.

So, with Oracle databases (according to Oracle), your system can hang when your paging space is full, even if you still have unused physical memory, because AIX can't allocate new frames for new processes once paging space is fully used. How much physical memory do you have?
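
A small sketch of reading paging activity from vmstat, assuming the usual layout where the "pi" and "po" columns give pages paged in from and out to paging space. The helper name paging_activity is made up; it locates the columns by their header names rather than by position and prints only the samples where paging occurred. Keep in mind that the first vmstat sample reports averages since boot.

```shell
#!/bin/sh
# Hypothetical helper: read vmstat output on stdin and print the samples
# in which pages were moved to or from paging space (pi or po above zero).
paging_activity() {
    awk '
        !pi_col {
            # still looking for the column header line
            for (i = 1; i <= NF; i++) {
                if ($i == "pi") pi_col = i
                if ($i == "po") po_col = i
            }
            next
        }
        $1 ~ /^[0-9]+$/ {
            n++                          # count data lines only
            if ($pi_col + 0 > 0 || $po_col + 0 > 0)
                printf "sample %d: pi=%s po=%s\n", n, $pi_col, $po_col
        }'
}

# On the server (not run here): five samples, two seconds apart
#   vmstat 2 5 | paging_activity
```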

3) Core file
You can remove it (unless you know what to do with it). Which process crashes?

4) As Chapter11 already said, are you sure you're not running out of some resource? Can you monitor system activity with vmstat, iostat and topas during normal activity and before the system hangs? Could it be a disk I/O issue (this may sound stupid, but I have already seen a listener locked on its log file in a 100% full filesystem that gets cleared at startup ...)?
 
nmon is very helpful:

nmon v6g  Hostname=s29ix100  Refresh=2.0secs  12:00.40
Memory Use   Physical   Virtual   Paging pages/sec      In     Out   VM parameters
% Used        100.0%      2.5%    to Paging Space      0.0     0.0   numperm  84.6%
% Free          0.0%     97.5%    to File System       0.0  2957.2   minperm   5.0%
MB Used      2047.3MB   189.5MB   Page Scans        3268.9           maxperm  10.0%
MB Free         0.6MB  7362.5MB   Page Cycles          0.0           minfree  120
Total(MB)    2048.0MB  7552.0MB   Page Reclaim         0.0           maxfree  128


This machine is pretty idle at the moment, just restoring the Oracle db. For comparison, the data filesystems on this machine add up to 546GB in total, so your numbers may vary, but you can see that "Physical" shows your RAM while "Virtual" shows RAM plus paging.

hth IBM Certified -- AIX 4.3 Obfuscation
 
Hi folks,


Is it me, or are there filesets missing on the server? I cannot seem to run any of the above commands (i.e. svmon or nmon) as the root user.

Can you please help me?

Cheers

K
 
Katherine,

This isn't going to answer your original question about the listener hanging, but I just wanted to point out that if you have your Oracle processes set to 300 and your maxuproc for the OS set to 256, your Oracle processes are going to stop at 256 because all those database processes are owned by your oracle user -- not by each of the app users.
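
To make that arithmetic concrete, here is a tiny sketch using the figures from the original post (Oracle processes 300, maxuproc 256). The helper name effective_process_cap is made up, and the real ceiling will be somewhat lower because the listener and any other processes owned by the oracle user also count against maxuproc. The lsattr and chdev lines are the usual AIX commands for reading and changing the limit, shown as comments only.

```shell
#!/bin/sh
# Hypothetical helper: the lower of the Oracle "processes" parameter ($1)
# and the AIX maxuproc limit ($2) is the ceiling that is actually reachable.
effective_process_cap() {
    if [ "$1" -le "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# On AIX (not run here):
#   lsattr -El sys0 -a maxuproc        # show the current per-user limit
#   chdev -l sys0 -a maxuproc=512      # raise it; 512 is the value proposed earlier in the thread
#   effective_process_cap 300 512
```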

I have maxuproc set to 1000 on an H80 with 8 GB of RAM and 6 CPUs and it's been like that for ages.

Regarding putting paging on non-system disks: yes, you can do that with no problem.

Take a look at your vmtune parameters. Check the man pages for vmtune. IBM also has a Red Book on tuning AIX and that book has a chapter on tuning for Oracle databases. That explains some things to look into.

We started having a similar problem with listener core dumping. We are still having it intermittently and are still trying to find the solution. I have read that adding paging space should help, but we still core dumped after doubling our paging space -- but it did take longer to get to the core dump.

Some info on the Oracle metalink page has made me think it might be related to our beginning to use rman. Our problems started right after our DBA instituted rman.
 
Hi

Can you please tell me which directory I should download the nmon7.tar file into and run it from? The documentation does not say.

Should it go into the /usr/bin directory?

Regards

KOG
 