Sorry, I was mistaken; here is the right string:

INFO [main] 2012-06-14 02:03:14,520 CLibrary.java (line 109) JNA mlockall successful
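If you want a check that does not depend on log lines at all, the kernel reports locked memory per process in /proc/<pid>/status. A minimal sketch (a hypothetical helper, not part of Cassandra; it assumes a Linux /proc filesystem):

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    // Prints the VmLck line from /proc/<pid>/status, which reports how much of
    // the process's address space is locked with mlock()/mlockall(). A non-zero
    // value after startup means mlockall really took effect.
    public class CheckMlock {
        public static void main(String[] args) throws IOException {
            String pid = (args.length > 0) ? args[0] : "self";
            BufferedReader r = new BufferedReader(new FileReader("/proc/" + pid + "/status"));
            try {
                String line;
                while ((line = r.readLine()) != null) {
                    if (line.startsWith("VmLck")) {
                        System.out.println(line); // e.g. "VmLck:  8388608 kB"
                    }
                }
            } finally {
                r.close();
            }
        }
    }

Run it with Cassandra's pid as the argument.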
2012/6/15 ruslan usifov <ruslan.usi...@gmail.com>:
> 2012/6/14 Gurpreet Singh <gurpreet.si...@gmail.com>:
>> JNA is installed. swappiness was 0. vfs_cache_pressure was 100. Two questions on this:
>> 1. Is there a way to find out if mlockall really worked, other than the "mlockall successful" log message?
>
> Yes, you must see something like this (from our test server):
>
> INFO [main] 2012-06-14 02:03:14,745 DatabaseDescriptor.java (line 233) Global memtable threshold is enabled at 512MB
>
>> 2. Does cassandra mlock only the jvm heap, or also the mmapped memory?
>
> Cassandra obviously mlocks only the heap; it doesn't lock the mmapped sstables.
>
>> I disabled mmap completely, and things look so much better. Latency is, surprisingly, half of what I see when mmap is enabled. It's funny that I keep reading tall claims about mmap, but in practice a lot of people have problems with it, especially when it uses up all the memory. We have tried mmap for different purposes in our company before, and finally ended up disabling it, because it just doesn't handle things right when memory is low. Maybe /proc/sys/vm needs to be configured right, but that's not the easiest of configurations to get right.
>>
>> Right now, I am handling only 80 gigs of data. The kernel version is 2.6.26, the Java version 1.6.0_21.
>> /G
>>
>> On Wed, Jun 13, 2012 at 8:42 PM, Al Tobey <a...@ooyala.com> wrote:
>>> I would check /etc/sysctl.conf and get the values of /proc/sys/vm/swappiness and /proc/sys/vm/vfs_cache_pressure.
>>>
>>> If you don't have JNA enabled (which Cassandra uses to fadvise) and swappiness is at its default of 60, the Linux kernel will happily swap out your heap for cache space. Set swappiness to 1 or 'swapoff -a', and kswapd shouldn't be doing much unless you have a too-large heap or some other app using up memory on the system.
>>>
>>> On Wed, Jun 13, 2012 at 11:30 AM, ruslan usifov <ruslan.usi...@gmail.com> wrote:
>>>> Hm, it's very strange. What is the amount of your data? Your Linux kernel version? Java version?
>>>>
>>>> PS: I can suggest switching disk_access_mode to standard in your case.
>>>> PS PS: Also upgrade your Linux to the latest, and Java HotSpot to 1.6.0_32 (from the Oracle site).
>>>>
>>>> 2012/6/13 Gurpreet Singh <gurpreet.si...@gmail.com>:
>>>>> Alright, here it goes again...
>>>>> Even with mmap_index_only, once the RES memory hit 15 gigs, the read latency went berserk. This happens in 12 hours if disk_access_mode is mmap, and in about 48 hrs if it is mmap_index_only.
>>>>>
>>>>> only reads happening, at 50 reads/second
>>>>> row cache size: 730 mb, row cache hit ratio: 0.75
>>>>> key cache size: 400 mb, key cache hit ratio: 0.4
>>>>> heap size (max 8 gigs): used 6.1-6.9 gigs
>>>>>
>>>>> No messages about reducing cache sizes in the logs.
>>>>>
>>>>> stats:
>>>>> vmstat 1: no swapping here, but high sys cpu utilization
>>>>> iostat (looks great): avgqu-sz = 8, await = 7 ms, svctm = 0.6 ms, %util = 15-30%
>>>>> top: VIRT 19.8g, SHR 6.1g, RES 15g, high cpu, buffers 2mb
>>>>> cfstats: 70-100 ms. This number used to be 20-30 ms.
>>>>>
>>>>> The value of SHR keeps increasing (owing to mmap, I guess), while at the same time buffers keep decreasing. buffers start as high as 50 mb and go down to 2 mb.
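To watch that page-cache shrinkage from the side, something like the following standalone monitor could help (a rough sketch; it assumes nothing beyond Linux's /proc/meminfo and is not Cassandra code):

    import java.io.BufferedReader;
    import java.io.FileReader;

    // Samples the Buffers: and Cached: lines of /proc/meminfo once a minute,
    // so the eviction described above can be lined up with latency spikes.
    public class WatchPageCache {
        public static void main(String[] args) throws Exception {
            while (true) {
                BufferedReader r = new BufferedReader(new FileReader("/proc/meminfo"));
                try {
                    String line;
                    while ((line = r.readLine()) != null) {
                        if (line.startsWith("Buffers:") || line.startsWith("Cached:")) {
                            System.out.println(System.currentTimeMillis() + " " + line);
                        }
                    }
                } finally {
                    r.close();
                }
                Thread.sleep(60000L); // one sample per minute
            }
        }
    }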
>>>>> This is very easily reproducible for me. Every time the RES memory hits about 15 gigs, the client starts getting timeouts from cassandra and the sys cpu jumps a lot. All this even though my row cache hit ratio is almost 0.75.
>>>>>
>>>>> Other than turning off mmap completely, is there any other solution or setting that avoids a cassandra restart every couple of days, something to keep the RES memory from hitting such a high number? I have been constantly monitoring the RES and was not seeing issues when RES was at 14 gigs.
>>>>> /G
>>>>>
>>>>> On Fri, Jun 8, 2012 at 10:02 PM, Gurpreet Singh <gurpreet.si...@gmail.com> wrote:
>>>>>> Aaron, Ruslan,
>>>>>> I changed the disk access mode to mmap_index_only, and it has been stable ever since, at least for the past 20 hours. Previously, in about 10-12 hours, as soon as the resident memory was full, the client would start timing out on all its reads. It looks fine for now; I am going to let it continue, to see how long it lasts and whether the problem comes back.
>>>>>>
>>>>>> Aaron,
>>>>>> Yes, I had turned swap off.
>>>>>>
>>>>>> The total cpu utilization was at roughly 700%. It looked like kswapd0 was using just 1 cpu, but cassandra (jsvc) cpu utilization increased quite a bit. top was reporting high system cpu and low user cpu. vmstat was not showing swapping. The max java heap size is 8 gigs, while only 4 gigs was in use, so the java heap was doing great; no gc in the logs. iostat was doing ok from what I remember; I will have to reproduce the issue for the exact numbers.
>>>>>>
>>>>>> cfstats latency had gone very high, but that is partly due to the high cpu usage.
>>>>>>
>>>>>> One thing was clear: SHR was inching higher (due to the mmap), while the buffer cache, which started at about 20-25 mb, was down to 2 mb by the end, which probably means the pagecache was being evicted by kswapd0. Is there a way to fix the size of the buffer cache and not let the system evict it in favour of mmap?
>>>>>>
>>>>>> Also, mmapping data files would cause not only the data asked for to be read into main memory, but also a bunch of extra pages (readahead), which would not be very useful, right? The same thing for the index would actually be more useful, as there would be more index entries in the readahead part, and the index files, being small, wouldn't cause enough memory pressure to evict the page cache. mmapping the data files would make sense if the data size, or at least the hot data set, is smaller than the RAM; otherwise just the index would probably be a better thing to mmap, no? In my case the data size is 85 gigs, while the available RAM is 16 gigs (only 8 gigs after heap).
>>>>>>
>>>>>> /G
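The arithmetic behind that readahead point, with the numbers quoted above, as a toy illustration (the percentage is rough; nothing here is measured):

    // Back-of-the-envelope: how much of the data files can stay resident.
    public class CacheFit {
        public static void main(String[] args) {
            double ramGb = 16.0, heapGb = 8.0;
            double cacheGb = ramGb - heapGb;  // roughly what is left for page cache
            double dataGb = 85.0;             // sstable data files, per this thread
            System.out.printf("data resident at best: %.0f%%%n", 100 * cacheGb / dataGb);
            // ~9%: a readahead page of *data* is rarely reused before being
            // evicted, so it mostly pushes out hotter pages. Index files are a
            // small fraction of this size, so readahead there lands on entries
            // that have a real chance of being hit again.
        }
    }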
>>>>>>
>>>>>> On Fri, Jun 8, 2012 at 11:44 AM, aaron morton <aa...@thelastpickle.com> wrote:
>>>>>>> Ruslan,
>>>>>>> Why did you suggest changing the disk_access_mode?
>>>>>>>
>>>>>>> Gurpreet,
>>>>>>> I would leave the disk_access_mode at the default until you have a reason to change it.
>>>>>>>
>>>>>>>> 8 core, 16 gb ram, 6 data disks raid0, no swap configured
>>>>>>>
>>>>>>> Is swap disabled?
>>>>>>>
>>>>>>>> Gradually, the system cpu becomes high, almost 70%, and the client starts getting continuous timeouts
>>>>>>>
>>>>>>> 70% of one core or 70% of all cores?
>>>>>>> Check the server logs: is there GC activity?
>>>>>>> Check nodetool cfstats to see the read latency for the cf.
>>>>>>>
>>>>>>> Take a look at vmstat to see if you are swapping, and look at iostat to see if io is the problem:
>>>>>>> http://spyced.blogspot.co.nz/2010/01/linux-performance-basics.html
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> -----------------
>>>>>>> Aaron Morton
>>>>>>> Freelance Developer
>>>>>>> @aaronmorton
>>>>>>> http://www.thelastpickle.com
>>>>>>>
>>>>>>> On 8/06/2012, at 9:00 PM, Gurpreet Singh wrote:
>>>>>>>
>>>>>>> Thanks Ruslan.
>>>>>>> I will try the mmap_index_only.
>>>>>>> Is there any guideline as to when to leave it on auto and when to use mmap_index_only?
>>>>>>>
>>>>>>> /G
>>>>>>>
>>>>>>> On Fri, Jun 8, 2012 at 1:21 AM, ruslan usifov <ruslan.usi...@gmail.com> wrote:
>>>>>>>> disk_access_mode: mmap??
>>>>>>>>
>>>>>>>> Set disk_access_mode: mmap_index_only in cassandra.yaml.
>>>>>>>>
>>>>>>>> 2012/6/8 Gurpreet Singh <gurpreet.si...@gmail.com>:
>>>>>>>>> Hi,
>>>>>>>>> I am testing cassandra 1.1 on a 1 node cluster:
>>>>>>>>> 8 core, 16 gb ram, 6 data disks raid0, no swap configured
>>>>>>>>>
>>>>>>>>> cassandra 1.1.1
>>>>>>>>> heap size: 8 gigs
>>>>>>>>> key cache size in mb: 800 (used only 200mb till now)
>>>>>>>>> memtable_total_space_in_mb: 2048
>>>>>>>>>
>>>>>>>>> I am running a read workload, about 30 reads/second, no writes at all. The system runs fine for roughly 12 hours.
>>>>>>>>>
>>>>>>>>> jconsole shows that my heap size has hardly touched 4 gigs.
>>>>>>>>> top shows:
>>>>>>>>> SHR increasing slowly from 100 mb to 6.6 gigs in these 12 hrs
>>>>>>>>> RES increasing slowly from 6 gigs all the way to 15 gigs
>>>>>>>>> buffers at a healthy 25 mb at some point, going down to 2 mb in these 12 hrs
>>>>>>>>> VIRT staying at 85 gigs
>>>>>>>>>
>>>>>>>>> I understand that SHR goes up because of mmap, and RES goes up because it includes the SHR value as well.
>>>>>>>>>
>>>>>>>>> After around 10-12 hrs, the cpu utilization of the system starts increasing, and I notice that the kswapd0 process becomes more active. Gradually, the system cpu becomes high, almost 70%, and the client starts getting continuous timeouts. The fact that the buffers went down from 20 mb to 2 mb suggests that kswapd0 is probably swapping out the pagecache.
>>>>>>>>>
>>>>>>>>> Is there a way to avoid kswapd0 kicking in even when there is no swap configured? This is very easily reproducible for me, and I would like a way out of this situation. Do I need to adjust vm memory management settings like pagecache and vfs_cache_pressure, things like that?
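For reference, the current values of those vm tunables can be read straight out of /proc; a minimal sketch (standard Linux paths are assumed, nothing Cassandra-specific):

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    // Prints the two vm tunables discussed in this thread. Defaults are
    // typically swappiness = 60 and vfs_cache_pressure = 100.
    public class VmTunables {
        static String read(String path) throws IOException {
            BufferedReader r = new BufferedReader(new FileReader(path));
            try { return r.readLine(); } finally { r.close(); }
        }
        public static void main(String[] args) throws IOException {
            System.out.println("swappiness = " + read("/proc/sys/vm/swappiness"));
            System.out.println("vfs_cache_pressure = " + read("/proc/sys/vm/vfs_cache_pressure"));
        }
    }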
>>>>>>>>>
>>>>>>>>> Just some extra information: jna is installed, mlockall is successful, and there is no compaction running.
>>>>>>>>> I would appreciate any help on this.
>>>>>>>>> Thanks
>>>>>>>>> Gurpreet
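For completeness, the mlockall call behind the log line at the top of this thread boils down to a small JNA binding. A sketch in the spirit of Cassandra's CLibrary (it assumes JNA is on the classpath; the MCL_* constants are the Linux values):

    import com.sun.jna.LastErrorException;
    import com.sun.jna.Native;

    // Locks the process's pages in RAM so the kernel cannot swap them out.
    // Needs CAP_IPC_LOCK (root) or a large enough memlock rlimit to succeed.
    public class MlockallSketch {
        static { Native.register("c"); } // bind the native method below to libc

        private static final int MCL_CURRENT = 1; // lock pages mapped now
        private static final int MCL_FUTURE  = 2; // and pages mapped later

        private static native int mlockall(int flags) throws LastErrorException;

        public static void main(String[] args) {
            try {
                mlockall(MCL_CURRENT); // returns 0 on success
                System.out.println("mlockall successful");
            } catch (LastErrorException e) {
                System.out.println("mlockall failed, errno " + e.getErrorCode());
            }
        }
    }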