Alright, here it goes again... Even with mmap_index_only, once RES memory
hit 15 GB, read latency went berserk. This happens in about 12 hours if
disk_access_mode is mmap, and about 48 hours if it is mmap_index_only.

The only load is reads, at about 50 reads/second.

row cache size: 730 MB, row cache hit ratio: 0.75
key cache size: 400 MB, key cache hit ratio: 0.4
heap size (max 8 GB): 6.1-6.9 GB used
no messages in the logs about reducing cache sizes

stats:
vmstat 1: no swapping, but high sys CPU utilization
iostat (looks great): avgqu-sz = 8, await = 7 ms, svctm = 0.6 ms, %util = 15-30
top: VIRT 19.8g, SHR 6.1g, RES 15g, high CPU, buffers 2 MB
cfstats: 70-100 ms; this number used to be 20-30 ms

SHR keeps increasing (owing to mmap, I guess), while at the same time
buffers keeps decreasing: buffers starts as high as 50 MB and goes down to
2 MB. This is very easily reproducible for me. Every time RES hits about
15 GB, the client starts getting timeouts from Cassandra and sys CPU jumps
a lot. All this even though my row cache hit ratio is almost 0.75.

Other than turning off mmap completely, is there any other solution or
setting to avoid a Cassandra restart every couple of days, i.e. something
to keep RES from hitting such a high number? I have been constantly
monitoring RES, and was not seeing issues while it was at 14 GB.
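In case someone can sanity-check them, these are the VM knobs I am
thinking of experimenting with next. The values below are guesses on my
part, not tested recommendations:

  # current values, for reference
  sysctl vm.min_free_kbytes vm.vfs_cache_pressure vm.swappiness

  # keep ~256 MB free so reclaim kicks in earlier and less violently
  sudo sysctl -w vm.min_free_kbytes=262144
  # make the kernel less eager to throw out dentry/inode caches
  sudo sysctl -w vm.vfs_cache_pressure=50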
/G

On Fri, Jun 8, 2012 at 10:02 PM, Gurpreet Singh <gurpreet.si...@gmail.com> wrote:

> Aaron, Ruslan,
> I changed the disk access mode to mmap_index_only, and it has been stable
> ever since, at least for the past 20 hours. Previously, in about 10-12
> hours, as soon as resident memory was full, the client would start timing
> out on all its reads. It looks fine for now; I am going to let it run to
> see how long it lasts and whether the problem comes back.
>
> Aaron,
> yes, I had turned swap off.
>
> The total CPU utilization was at roughly 700%. It looked like kswapd0 was
> using just one CPU, but Cassandra's (jsvc) CPU utilization increased
> quite a bit. top was reporting high system CPU and low user CPU.
> vmstat was not showing swapping. The max Java heap size is 8 GB, of which
> only 4 GB was in use, so the Java heap was doing great; no GC in the
> logs. iostat was doing OK from what I remember; I will have to reproduce
> the issue for the exact numbers.
>
> cfstats latency had gone very high, but that is partly due to the high
> CPU usage.
>
> One thing was clear: SHR was inching higher (due to the mmap) while the
> buffer cache, which started at about 20-25 MB, had shrunk to 2 MB by the
> end, which probably means the page cache was being evicted by kswapd0. Is
> there a way to fix the size of the buffer cache and not let the system
> evict it in favour of mmap?
>
> Also, mmapping the data files causes not just the data actually asked for
> to be read into main memory, but also a bunch of extra pages (readahead),
> which would not be very useful, right? The same readahead on the index
> files would actually be more useful, since there would be more index
> entries in the readahead window, and the index files, being small, would
> not create enough memory pressure to evict the page cache. mmapping the
> data files makes sense if the data size, or at least the hot data set, is
> smaller than RAM; otherwise mmapping just the index would probably be the
> better choice, no? In my case the data size is 85 GB, while available RAM
> is 16 GB (only 8 GB after the heap).
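(Expanding on my own readahead question above: the readahead window is set
per block device, so it is cheap to experiment with. A sketch of what I
plan to try, assuming the raid0 array shows up as /dev/md0 here; blockdev
counts in 512-byte sectors:)

  blockdev --getra /dev/md0           # current readahead, in 512-byte sectors
  sudo blockdev --setra 64 /dev/md0   # shrink to 64 sectors = 32 KB and re-test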
>
> /G
>
> On Fri, Jun 8, 2012 at 11:44 AM, aaron morton <aa...@thelastpickle.com> wrote:
>
>> Ruslan,
>> Why did you suggest changing the disk_access_mode?
>>
>> Gurpreet,
>> I would leave the disk_access_mode at the default until you have a
>> reason to change it.
>>
>>> 8 core, 16 gb ram, 6 data disks raid0, no swap configured
>>
>> Is swap disabled?
>>
>>> Gradually, the system cpu becomes high almost 70%, and the client
>>> starts getting continuous timeouts
>>
>> 70% of one core or 70% of all cores?
>> Check the server logs: is there GC activity?
>> Check nodetool cfstats to see the read latency for the CF.
>>
>> Take a look at vmstat to see if you are swapping, and look at iostat to
>> see if IO is the problem:
>> http://spyced.blogspot.co.nz/2010/01/linux-performance-basics.html
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 8/06/2012, at 9:00 PM, Gurpreet Singh wrote:
>>
>> Thanks Ruslan.
>> I will try the mmap_index_only.
>> Is there any guideline as to when to leave it on auto and when to use
>> mmap_index_only?
>>
>> /G
>>
>> On Fri, Jun 8, 2012 at 1:21 AM, ruslan usifov <ruslan.usi...@gmail.com> wrote:
>>
>>> disk_access_mode: mmap??
>>>
>>> Set disk_access_mode: mmap_index_only in cassandra.yaml.
>>>
>>> 2012/6/8 Gurpreet Singh <gurpreet.si...@gmail.com>:
>>> > Hi,
>>> > I am testing cassandra 1.1 on a 1 node cluster:
>>> > 8 core, 16 gb ram, 6 data disks raid0, no swap configured
>>> >
>>> > cassandra 1.1.1
>>> > heap size: 8 gigs
>>> > key cache size in mb: 800 (only 200 MB used so far)
>>> > memtable_total_space_in_mb: 2048
>>> >
>>> > I am running a read workload, about 30 reads/second, no writes at
>>> > all. The system runs fine for roughly 12 hours.
>>> >
>>> > jconsole shows that my heap size has hardly touched 4 gigs.
>>> > top shows:
>>> > SHR increasing slowly from 100 MB to 6.6 GB over these 12 hrs
>>> > RES increasing slowly from 6 GB all the way to 15 GB
>>> > buffers at a healthy 25 MB at some point, going down to 2 MB over
>>> > these 12 hrs
>>> > VIRT staying at 85 GB
>>> >
>>> > I understand that SHR goes up because of mmap, and RES goes up
>>> > because it includes the SHR value as well.
>>> >
>>> > After around 10-12 hrs, the CPU utilization of the system starts
>>> > increasing, and I notice that the kswapd0 process becomes more
>>> > active. Gradually, system CPU reaches almost 70%, and the client
>>> > starts getting continuous timeouts. The fact that buffers went down
>>> > from 20 MB to 2 MB suggests that kswapd0 is probably evicting the
>>> > page cache.
>>> >
>>> > Is there a way to keep kswapd0 from doing this even when there is no
>>> > swap configured?
>>> > This is very easily reproducible for me, and I would like a way out
>>> > of this situation. Do I need to adjust VM memory management settings
>>> > like pagecache or vfs_cache_pressure, things like that?
>>> >
>>> > Just some extra information: JNA is installed and mlockall is
>>> > successful. There is no compaction running.
>>> > I would appreciate any help on this.
>>> > Thanks
>>> > Gurpreet
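PS: to put numbers behind the kswapd0/page-cache theory discussed
throughout this thread, I have started sampling the kernel's reclaim
counters while the node degrades. A rough sketch (counter names vary a
little between kernels):

  # cumulative reclaim activity; sample twice, a few seconds apart, and diff
  grep -E 'pgscan_kswapd|pgsteal' /proc/vmstat

  # plain page cache vs pages mapped into processes (the mmap'd sstables)
  grep -E '^(Buffers|Cached|Mapped):' /proc/meminfo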