Large Amount of Page Faults

Status
Not open for further replies.

rroevz

Technical User
Sep 2, 2005
2
CH
Hi

On our P570 Oracle DB server we are seeing a large number of page faults.
Is there a problem with the memory tuning in our vmo settings?
Or what else could be the reason for this large number of faults?
And what is the meaning of odio/s?

[tt]
power2: # sar -r 2 5

AIX power2 3 5 00C255BE4C00 09/02/05

System configuration: lcpu=24 mem=40960MB ent=10.90

11:22:23 slots cycle/s fault/s odio/s
11:22:25 1949325 0.00 348821.67 3272.91
11:22:27 1949494 0.00 610265.69 2002.94
11:22:30 1949408 0.00 393747.32 2932.20
11:22:32 1949513 0.00 599087.25 1804.41
11:22:34 1949328 0.00 85013.47 3148.65

nmon v10 output
Physical PagingSpace pages/sec In Out FileSystemCache
% Used 99.8% 7.0% to Paging Space 0.0 0.0 (numperm) 69.2%
% Free 0.2% 93.0% to File System 91.5 2033.5 Process 22.9%
MB Used 40882.6MB 576.1MB Page Scans 7671.4 System 7.7%
MB Free 77.4MB 7615.9MB Page Cycles 0.0 Free 0.2%
Total(MB) 40960.0MB 8192.0MB Page Steals 4050.3 ------
Page Faults 590385.7 Total 100.0%
Min/Maxperm 3908MB( 10%) 23449MB( 57%) note: % of memory
Min/Maxfree 1320 2728 Total Virtual 48.0GB User 26.2%
Min/Maxpgahead 2 128 Accessed Virtual 12.3GB 25.6% Pinned 8.2%

power2: # vmo -a
cpu_scale_memp = 8 minfree = 1320
data_stagger_interval = 161 minperm = 1000479
defps = 1 minperm% = 10
force_relalias_lite = 0 nokilluid = 0
framesets = 2 npskill = 16384
htabscale = n/a npsrpgmax = 131072
kernel_heap_psize = 4096 npsrpgmin = 98304
large_page_heap_size = 0 npsscrubmax = 131072
lgpg_regions = 0 npsscrubmin = 98304
lgpg_size = 0 npswarn = 65536
low_ps_handling = 1 num_spec_dataseg = 0
lru_file_repage = 1 numpsblks = 2097152
lru_poll_interval = 0 page_steal_method = 1
lrubucket = 131072 pagecoloring = n/a
maxclient% = 60 pinnable_frames = 9628940
maxfree = 2728 pta_balance_threshold = n/a
maxperm = 6002879 relalias_percentage = 0
maxperm% = 60 rpgclean = 0
maxpin = 8388608 rpgcontrol = 2
maxpin% = 80 scrub = 0
mbuf_heap_psize = 4096 scrubclean = 0
memory_affinity = 1 soft_min_lgpgs_vmpool = 0
memory_frames = 10485760 spec_dataseg_int = 512
memplace_data = 2 strict_maxclient = 0
memplace_mapped_file = 2 strict_maxperm = 0
memplace_shm_anonymous = 2 v_pinshm = 0
memplace_shm_named = 2 vm_modlist_threshold = -1
memplace_stack = 2 vmm_fork_policy = 1
memplace_text = 2
memplace_unmapped_file = 2
mempools = 1

[/tt]

Thanks for any info.
Rolf



 
You have a large server (40GB memory). Your vmo values allow the server to use up to 60% of that memory as file I/O cache (maxperm% and maxclient% are both 60%).

If your Oracle instance is properly configured, you are probably buffering your file I/O twice, once by LVM in AIX memory and once in your Oracle SGA buffers. That is not a good idea, since you are pushing some of the other memory usage (processes) to paging space.

I would lower the maxperm% and maxclient% to free up memory for computational pages. 24GB of file I/O cache - that's overdoing it a bit...
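As a rough check on the 24GB figure, the frame counts from the vmo output in the original post can be converted to megabytes. This is only a sketch: it assumes 4096-byte page frames (the value shown for kernel_heap_psize), and the helper name frames_to_mb is made up for illustration.

```python
# Convert AIX VMM frame counts (taken from the "vmo -a" output above) to MB.
# Assumption: every frame is a 4 KB page.
PAGE_SIZE_BYTES = 4096


def frames_to_mb(frames: int) -> float:
    """Hypothetical helper: number of 4 KB frames -> megabytes."""
    return frames * PAGE_SIZE_BYTES / (1024 * 1024)


memory_frames = 10485760   # vmo: memory_frames
maxperm_frames = 6002879   # vmo: maxperm
minperm_frames = 1000479   # vmo: minperm

total_mb = frames_to_mb(memory_frames)
maxperm_mb = frames_to_mb(maxperm_frames)
minperm_mb = frames_to_mb(minperm_frames)

# 60% of all memory, which is where the "24GB" figure comes from.
nominal_cap_mb = total_mb * 60 / 100

print(f"total memory    : {total_mb:.0f} MB")
print(f"maxperm         : {maxperm_mb:.0f} MB")
print(f"minperm         : {minperm_mb:.0f} MB")
print(f"60% of total    : {nominal_cap_mb:.0f} MB ({nominal_cap_mb / 1024:.0f} GB)")
```

The maxperm and minperm values computed this way line up with the Min/Maxperm line in the nmon output (3908MB and 23449MB).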

Also, your database is fairly active, with more writes than reads. Oracle is pushing dirty file blocks back to the filesystems, but they are just being pushed to another memory segment and will wait there until LVM decides to flush them to the disks.

(I have large DB servers with less memory than you have in your file I/O buffer...)

HTH,

p5wizard
 
I agree with p5wizard on the cause, but I would go about solving the problem a bit differently. If the database is on JFS2, you might want to consider mounting the filesystems with CIO (concurrent I/O). It is new to JFS2 and was created because of this issue. In short, CIO has all the benefits of Direct I/O (DIO) without the inode locking. If you can't use JFS2 and CIO, then one thing you need to do is lower the read-ahead settings on your filesystems. This will at least reduce the amount of wasted memory. As I said, CIO gets close to the speed of raw devices, so it would be a good idea to look into it. You might also want to read this document. It's one of the better ones I have found.


TCorum
 
I suggest you tune your "maxperm" parameter
with this command
/usr/samples/kernel/vmtune -P10 -p 5 -h1 -t 10
to reduce the memory used for filesystem cache.
 
The best solution was to reduce maxperm% and maxclient%. I reduced both from 60% to 30%, and the page fault rate is now about a hundred times smaller than before.

I think vmtune would give much the same result, but I am on AIX 5.3 and use vmo.

We use JFS2, and I will remember your suggestion about CIO.

Thanks to all

rroevz
 
Are we all looking at the same original post?

rroevz: are you experiencing a performance problem? If so, post vmstat and iostat samplings.

The output posted doesn't show a machine with a problem, unless that problem is that it's insanely oversized for the work it's doing.

It's only using 576MB of paging space, probably consisting entirely of daemons, gettys, and various rarely used components of applications. nmon shows that it's not doing the bad kind of paging (to paging space) at all. The numbers for paging to file systems are just good old-fashioned writes to disk. AIX uses the same paging mechanism for virtual memory and for file reads and writes, which is why nmon shows two separate classes of page.

The fact that maxperm=60 and actual perm=69 shows that there's no need for any more computational page space in real memory. Since 9% of 40GB is about 3.6GB, this confirms that the paged out processes aren't needed. If they were, they'd automatically have priority over the file cache pages above the 60% maxperm mark.
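The arithmetic behind the "about 3.6GB" estimate can be sketched as follows, using the unrounded numperm figure from the nmon output (69.2%) and the 40960MB total. The variable names are illustrative only.

```python
# How much file cache sits above the maxperm% mark?
# Figures come from the nmon and vmo output in the original post.
total_mb = 40960.0    # nmon: Total(MB)
numperm_pct = 69.2    # nmon: FileSystemCache (numperm)
maxperm_pct = 60.0    # vmo: maxperm%

excess_mb = total_mb * (numperm_pct - maxperm_pct) / 100
excess_gb = excess_mb / 1024

print(f"file cache above maxperm%: {excess_mb:.0f} MB ({excess_gb:.2f} GB)")
```

With the unrounded percentages the result comes out slightly above the rounded estimate, but it is the same order of magnitude.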

odio/s shows the number of page faults per second that actually require I/O. The ratio of fault/s to odio/s equates to the file cache hit/miss ratio, and this machine's is phenomenal.

332:1? I'm salivating here.
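For reference, the fault/s to odio/s ratio can be worked out for each of the five sar samples in the original post. This is a minimal sketch with the values copied by hand from that output; the 332:1 figure appears to correspond to the 11:22:32 sample, and the other samples come out lower.

```python
# fault/s : odio/s ratio for each "sar -r 2 5" sample in the original post.
# Tuples are (timestamp, fault/s, odio/s), copied from the sar output.
samples = [
    ("11:22:25", 348821.67, 3272.91),
    ("11:22:27", 610265.69, 2002.94),
    ("11:22:30", 393747.32, 2932.20),
    ("11:22:32", 599087.25, 1804.41),
    ("11:22:34", 85013.47, 3148.65),
]

# Faults per fault that actually needed disk I/O, one value per sample.
ratios = [fault / odio for _, fault, odio in samples]

for (timestamp, _, _), ratio in zip(samples, ratios):
    print(f"{timestamp}  {ratio:7.1f} : 1")

print(f"best sample : {max(ratios):.0f} : 1")
print(f"worst sample: {min(ratios):.0f} : 1")
```

The spread between samples suggests looking at a longer sar run before reading too much into any single interval.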

Your vmstat must show a constant 0 in the "wa" column.

As to dirty file pages, if you're not using AIO the longest a dirty page can go without being written to disk is the period of your syncd process (default 60 seconds). If Oracle uses AIO, then writes are being delayed intentionally to increase performance, and Oracle should, and most likely does, have other measures in place to allow recovery of the unwritten data after a full system crash.

Rod Knowlton
IBM Certified Advanced Technical Expert pSeries and AIX 5L
CompTIA Linux+
CompTIA Security+

 
This is one of the best discussions of memory use that I have ever seen. I want to thank you all for the wonderful info conveyed herein!
 