CentOS 4.5 with 16 GB of RAM

paublo
Sep 14, 2006
I have a CentOS 4.5 server with 16 GB of RAM and more than one CPU. Until now the server was running great, but I started running out of memory.

I'm using kernel 2.6.9-55.ELsmp and it sees all of my 16 GB of RAM, however something seems off, like it's not efficiently using all available memory.

Mem: 16633348k total, 16604084k used, 29264k free, 278356k buffers
Swap: 2031608k total, 1844k used, 2029764k free, 15522468k cached

I have a CentOS 5.0 server, also with 16 GB of RAM, using a PAE kernel, and there I have 2 GB free, so the above memory usage doesn't seem right, even though I'm not using a PAE-style kernel on my CentOS 4.5 server.

I was looking around with yum and noticed kernel-hugemem.i686 for CentOS 4.5. I'm now wondering if I'm using the wrong kernel and should be using kernel-hugemem.i686 instead of 2.6.9-55.ELsmp. If so, I'm assuming kernel-hugemem.i686 is compatible with more than one CPU.
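(For anyone wanting to compare the same thing, something along these lines should show what's booted versus what's installed and on the mirrors; the hugemem package name is just what I saw in yum, so adjust if your repo calls it differently:)

uname -r                            # kernel currently booted
rpm -qa 'kernel*'                   # kernel packages currently installed
yum list available kernel-hugemem   # hugemem builds offered by the repos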

Any thoughts?

TIA, paul

Here's what I was seeing in the logs:

Jan 8 07:35:27 pop kernel: 4390912 pages of RAM
Jan 8 07:35:27 pop kernel: 3964840 pages of HIGHMEM
Jan 8 07:35:27 pop kernel: 232575 reserved pages
Jan 8 07:35:27 pop kernel: 1774881 pages shared
Jan 8 07:35:27 pop kernel: 4445 pages swap cached
Jan 8 07:35:27 pop kernel: Out of Memory: Killed process 27053 (httpd).
Jan 8 07:35:27 pop kernel: oom-killer: gfp_mask=0xd0
Jan 8 07:35:27 pop kernel: Mem-info:
Jan 8 07:35:27 pop kernel: DMA per-cpu:
Jan 8 07:35:27 pop kernel: cpu 0 hot: low 2, high 6, batch 1
Jan 8 07:35:27 pop kernel: cpu 0 cold: low 0, high 2, batch 1
Jan 8 07:35:27 pop kernel: cpu 1 hot: low 2, high 6, batch 1
Jan 8 07:35:27 pop kernel: cpu 1 cold: low 0, high 2, batch 1
Jan 8 07:35:27 pop kernel: cpu 2 hot: low 2, high 6, batch 1
Jan 8 07:35:27 pop kernel: cpu 2 cold: low 0, high 2, batch 1
Jan 8 07:35:27 pop kernel: cpu 3 hot: low 2, high 6, batch 1
Jan 8 07:35:27 pop kernel: cpu 3 cold: low 0, high 2, batch 1
Jan 8 07:35:27 pop kernel: Normal per-cpu:
Jan 8 07:35:27 pop kernel: cpu 0 hot: low 32, high 96, batch 16
Jan 8 07:35:27 pop kernel: cpu 0 cold: low 0, high 32, batch 16
Jan 8 07:35:27 pop kernel: cpu 1 hot: low 32, high 96, batch 16
Jan 8 07:35:27 pop kernel: cpu 1 cold: low 0, high 32, batch 16
Jan 8 07:35:27 pop kernel: cpu 2 hot: low 32, high 96, batch 16
Jan 8 07:35:27 pop kernel: cpu 2 cold: low 0, high 32, batch 16
Jan 8 07:35:27 pop kernel: cpu 3 hot: low 32, high 96, batch 16
Jan 8 07:35:27 pop kernel: cpu 3 cold: low 0, high 32, batch 16
Jan 8 07:35:27 pop kernel: HighMem per-cpu:
Jan 8 07:35:27 pop kernel: cpu 0 hot: low 32, high 96, batch 16
Jan 8 07:35:27 pop kernel: cpu 0 cold: low 0, high 32, batch 16
Jan 8 07:35:27 pop kernel: cpu 1 hot: low 32, high 96, batch 16
Jan 8 07:35:27 pop kernel: cpu 1 cold: low 0, high 32, batch 16
Jan 8 07:35:27 pop kernel: cpu 2 hot: low 32, high 96, batch 16
Jan 8 07:35:27 pop kernel: cpu 2 cold: low 0, high 32, batch 16
Jan 8 07:35:27 pop kernel: cpu 3 hot: low 32, high 96, batch 16
Jan 8 07:35:27 pop kernel: cpu 3 cold: low 0, high 32, batch 16
Jan 8 07:35:27 pop kernel:
Jan 8 07:35:27 pop kernel: Free pages: 215200kB (201728kB HighMem)
Jan 8 07:35:27 pop kernel: Active:3067716 inactive:837089 dirty:1 writeback:0 unstable:0 free:53800 slab:185033 mapped:90592 pagetables:9117
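(Those came out of the kernel log; on a stock syslog setup something like this should pull the same OOM messages, assuming the default /var/log/messages location:)

grep -iE "oom-killer|Out of Memory" /var/log/messages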

 
This comes up pretty often..

The line you want to worry about is:
1844k used, 2029764k free, 15522468k cached

The "cached" is essentially free for applications that is currently being used for disk cache. The "used" is the "real" RAM in use, very little of 16GB

D.E.R. Management - IT Project Management Consulting
 
The line you are looking at is for the swap space, which I notice is fine, but the physical memory is low to the point where the server was out of memory.

I shouldn't be seeing OOM killing processes if everything is OK, correct?
 
OK, sorry, I was in a hurry and I should answer a bit more carefully.

First, it is probably a misconfiguration to have 2GB of swap space for 16GB of RAM. Normally the guidance on disk swap space is 1.5X physical RAM. That MIGHT serve to ease the problem, but it's still peculiar.

Next, you have 15+GB of cached disk content. This is pretty high.

The hugemem-compiled kernels are designed to address BEYOND 16GB RAM, but by default they are supposed to be SMP enabled (multiple CPUs). You could try stepping up to the hugemem kernel for funzies and see if things stabilize.

Are you running any virtualization or other "special" processes that we should know about? That kind of caching is pretty extreme. Are you running software RAID?


D.E.R. Management - IT Project Management Consulting
 
thanks for the reply.

Originally we started with 2 GB of RAM, thus the 2 GB of swap.

I'm not running anything special, just dovecot pop3/imap, postfix, httpd and mysql. The only real traffic is on 110/http, no virtual stuff.

Does the below look bad, aside from the high disk cache? OOM seems to think it does.

Any thoughts on what I can do?

Currently I'm seeing:

             total       used       free     shared    buffers     cached
Mem:      16633348   16536564      96784          0     244656   15462048
-/+ buffers/cache:     829860   15803488
Swap:      2031608       1844    2029764

 
So that shows me:

16GB found. 16GB+ "used", 15+GB cached

Real application/kernel space is 829MB

Nearly no swap space is used.

Are you actually having any problems? This looks like a healthy system being aggressive in using RAM for disk cache. This is normal.
As you load applications, the cache space should reduce and the app/kernel space should increase.

Am I missing something?


D.E.R. Management - IT Project Management Consulting
 
Presuming the CentOS hugemem is similar to Red Hat's hugemem kernel, I would recommend you use it. Under the 2.4 kernels Red Hat recommended you switch to hugemem with anything above 6GB of RAM; I'm not sure whether that figure holds true for the 2.6 kernels, but I'd say with 16GB the hugemem kernel is a safe bet.
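(If you do go that route, the usual drill on CentOS 4 is roughly the following; /boot/grub/grub.conf is the stock grub location there, adjust if yours differs:)

# install the hugemem kernel alongside the current one
yum install kernel-hugemem

# check which entry grub will boot by default, and change "default=" if needed
grep -n "^default\|^title" /boot/grub/grub.conf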

thedaver said:
First, it is probably a misconfiguration to have 2GB of swap space for 16GB of RAM. Normally the guidance on disk swap space is 1.5X physical RAM. That MIGHT serve to ease the problem, but it's still peculiar.

I think this rule isn't really true any more with the relatively massive amounts of memory systems have these days. Personally I would never configure more than 4GB of swap on any system, and would certainly be looking for another solution rather than adding swap if it was filling up.

The only other consideration with heavy buffer cache usage over a long period of uptime is memory fragmentation. Buffer cache memory can be considered part of your 'free' memory, as reported by the third line of the output of the free command. However, I have encountered problems where systems with plenty of so-called free memory had memory allocation failures (usually with Oracle) because memory was severely fragmented and the kernel could not satisfy the allocation request with large enough contiguous chunks. Again, this was under a 2.4 kernel; I hope this has been improved in 2.6!

You can check the fragmentation level by using the "show memory" (SysRq-m) facility of SysRq.
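(A rough sketch of both approaches; the SysRq route needs sysrq enabled first and dumps into the kernel log, and on 2.6 /proc/buddyinfo gives a quicker read:)

# enable SysRq, then ask the kernel to dump memory info (shows up in dmesg / /var/log/messages)
echo 1 > /proc/sys/kernel/sysrq
echo m > /proc/sysrq-trigger

# free blocks per zone by order; plenty of small orders but no large ones suggests fragmentation
cat /proc/buddyinfo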

Annihilannic.
 
SOLVED.

It looks like low memory (the LowMem line in free -lm) was down to zero, and that was causing OOM to kill processes.

I did echo "250" > /proc/sys/vm/lower_zone_protection and it took care of the issue.

Hope this helps someone.
 