Output of free -m:
             total       used       free     shared    buffers     cached
Mem:         96741      50318      46422          0         21      44160
-/+ buffers/cache:       6136      90605
Swap:        90111          3      90107
As you can see here, the system is only using about half of its memory for cache, and leaving the other half free. This would be normal behavior if only half the cache were needed, but iostat also showed numerous and frequent reads from disk, resulting in I/O waits for user queries. Still, there could be other explanations for that.
So, I tried forcing a cache fill by doing a pg_dump. This caused the cache to mostly fill free memory -- but then Linux aggressively cleared the cache, getting it back down to around 40GB within a few minutes. This happened no matter what we did, including tinkering with the vm parameters, increasing the size of the swap file, and changing shared_buffers. It was highly peculiar; it was as if Linux were convinced that we had half as much RAM as we actually did.
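For anyone who wants to reproduce the diagnosis, the sketch below shows the sort of monitoring and vm tinkering involved. The specific sysctl knobs and values are only illustrations of what one might try, not a record of the exact settings we tested (and nothing like this helped in our case):

# Watch the page cache while forcing a large sequential read (pg_dump works well for this)
watch -n 5 free -m
iostat -x 5

# Inspect and adjust the usual vm tunables (example value only)
sysctl vm.swappiness
sysctl vm.vfs_cache_pressure
sysctl -w vm.swappiness=10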
What fixed the problem was changing the kernel version. It turns out that kernel
2.6.32-71.29.1.el6.x86_64, released by Red Hat during a routine update, has some kind of cache management issue which can't be fixed in user space. Fortunately, they now have a later kernel version out as an update.
Before:
[root ~]# free -g
             total       used       free     shared    buffers     cached
Mem:            94         24         70          0          0         19
[root ~]# uname -a
Linux server1.company.com 2.6.32-71.29.1.el6.x86_64 #1 SMP Mon
Jun 27 19:49:27 BST 2011 x86_64 x86_64 x86_64 GNU/Linux
After:
[root ~]# free -g
             total       used       free     shared    buffers     cached
Mem:            94         87          6          0          0         83
[root ~]# uname -a
Linux server1.company.com 2.6.32-220.4.2.el6.x86_64 #1 SMP Tue
Feb 14 04:00:16 GMT 2012 x86_64 x86_64 x86_64 GNU/Linux
That's more like it! Thanks to Andrew Kerr of Tigerlead for helping figure this issue out.
I don't know if other Linux distributors shipped the same kernel in a routine update. I haven't seen this behavior (yet) with Ubuntu, Debian, or SuSE. If you see it, please report it in the comments, or better yet, to the appropriate mailing list.
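If you want to check whether a RHEL/CentOS 6 box is running the affected kernel and move to the fixed one, the usual sequence is roughly the following (a sketch; package versions and your update policy will vary):

# What kernel is currently running?
uname -r

# What kernel packages are installed, and is a newer one available?
rpm -q kernel
yum list kernel

# Install the updated kernel, then boot into it
yum update kernel
reboot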
Do you know which exact version fixed this?
2.6.32-131.0.15.el6.x86_64 doesn't seem to have this problem, so it must have been fixed before that.
Thanks! That narrows it down even further.
Now the other question is when the problem was introduced ...
The buffers and caches can be cleared using these commands (at least in Linux 2.6.16 and later kernels):
To free pagecache:
echo 1 > /proc/sys/vm/drop_caches
To free dentries and inodes:
echo 2 > /proc/sys/vm/drop_caches
To free pagecache, dentries and inodes:
echo 3 > /proc/sys/vm/drop_caches
Matias Colli
RHCSA
Of course, I always use echo 3 > /proc/sys/vm/drop_caches
Matias Colli
RHCSA
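One addendum to the drop_caches recipes above: drop_caches only releases clean cache pages, so the kernel documentation suggests flushing dirty pages with sync first, along these lines:

# Write out dirty pages, then drop pagecache, dentries and inodes
sync
echo 3 > /proc/sys/vm/drop_caches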
Josh, do you know the model of CPUs with which this issue was seen? AMD or Intel? I have a similar case on Intel; AMD works fine. But it could be just a coincidence.
Was there any NFS?
Another one: http://comments.gmane.org/gmane.comp.db.sqlite.general/79457
I just found a relation between free (unused) memory and the count of active processes: ~1GB per backend.
On 376GB total memory and 32 cores, if (user CPU + I/O wait) is at 145%, then I have ~140GB free.