Hang on, how are you using 11G total memory? m1.large only has 7.5G total RAM.
On Fri, Oct 10, 2014 at 2:56 PM, Nick Dimiduk <ndimi...@gmail.com> wrote:

> ByteBuffer position math errors make me suspect #1 cacheonwrite and #2
> bucketcache (and #3 their use in combination ;) )
>
> 11G memory is a small enough footprint that I'd not bother with
> BucketCache; just stay on heap with the default LruBlockCache.
>
> On Fri, Oct 10, 2014 at 2:11 PM, Stack <st...@duboce.net> wrote:
>
>> On Fri, Oct 10, 2014 at 10:59 AM, Khaled Elmeleegy <kd...@hotmail.com>
>> wrote:
>>
>> > Yes, I can reproduce it with some work.
>> >
>> > The workload is basically as follows: there are writers streaming
>> > writes to a table. Then there is a reader (invoked via a web
>> > interface). The reader does 1000 parallel reverse scans, which in my
>> > case all end up hitting the same region. The scans are effectively
>> > "gets", since I just need to get one record off each of them. What I
>> > really need is a "reverse" get, which is not supported (would be great
>> > to have :) ), so I do it via a reverse scan. After a few tries, the
>> > reader consistently hits this bug.
>> >
>> > This happens with these config changes:
>> >
>> > hbase-env:
>> >   HBASE_REGIONSERVER_OPTS=-Xmx6G -XX:MaxDirectMemorySize=5G
>> >     -XX:CMSInitiatingOccupancyFraction=88 -XX:+AggressiveOpts
>> >     -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
>> >     -Xloggc:/tmp/hbase-regionserver-gc.log
>> >
>> > hbase-site:
>> >   hbase.bucketcache.ioengine=offheap
>> >   hbase.bucketcache.size=4196
>> >   hbase.rs.cacheblocksonwrite=true
>> >   hfile.block.index.cacheonwrite=true
>> >   hfile.block.bloom.cacheonwrite=true
>> >
>> > Interestingly, without these config changes, I can't reproduce the
>> > problem.
>>
>> How hard would it be to play w/ combinations? Could you eliminate the
>> cacheonwrites on one server and see if that cures the issue? Could you
>> turn off the block cache on another to see if that's the problem?
>> Anything related in your .out files?
>>
>> St.Ack
>
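
As an aside, for anyone following the thread: the "reverse get" via reverse
scan that Khaled describes can be sketched roughly as below. This is just a
sketch assuming the HBase 1.x client API; the table name "mytable" and row
key "row-9999" are placeholders, not anything from this thread.

  // Rough sketch of a "reverse get" done with a reversed Scan.
  // Table name and row key below are made-up placeholders.
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.TableName;
  import org.apache.hadoop.hbase.client.Connection;
  import org.apache.hadoop.hbase.client.ConnectionFactory;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.ResultScanner;
  import org.apache.hadoop.hbase.client.Scan;
  import org.apache.hadoop.hbase.client.Table;
  import org.apache.hadoop.hbase.util.Bytes;

  public class ReverseGetSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      try (Connection conn = ConnectionFactory.createConnection(conf);
           Table table = conn.getTable(TableName.valueOf("mytable"))) {
        Scan scan = new Scan();
        scan.setReversed(true);                       // walk rows in descending key order
        scan.setStartRow(Bytes.toBytes("row-9999"));  // highest key of interest; scan moves backwards from here
        scan.setCaching(1);                           // only one row is needed
        try (ResultScanner scanner = table.getScanner(scan)) {
          Result first = scanner.next();              // the "reverse get": first row at or below the start key
          if (first != null) {
            System.out.println("found row: " + Bytes.toString(first.getRow()));
          }
        }
      }
    }
  }

On 0.98, which this thread appears to be running, the table handle would come
from the older HTableInterface API instead, but the Scan setup is the same.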