Done. It's HBASE-12270

Please let me know if you have any comments/questions.

----------------------------------------
> Date: Mon, 13 Oct 2014 22:42:37 -0500
> Subject: RE: HBase read performance
> From: bus...@cloudera.com
> To: user@hbase.apache.org
>
> Yes, please file a JIRA and attach the test there.
>
> --
> Sean
> On Oct 13, 2014 10:22 PM, "Khaled Elmeleegy" <kd...@hotmail.com> wrote:
>
>> Actually, it's very easy to reproduce. I have a simple program that
>> reproduces it if run against a simple toy cluster with one RS. I shared
>> the program with Ted. I can file a jira ticket about it if you want.
>>
>> Khaled
>>
>>> Date: Mon, 13 Oct 2014 19:12:13 -0700
>>> From: la...@apache.org
>>> Subject: Re: HBase read performance
>>> To: user@hbase.apache.org
>>>
>>> This looks like something to track.
>>> If there's an easy way to reproduce this, could you file a jira with
>>> instructions?
>>> (Although from the conversations here, it looks like it's not trivial
>>> to reproduce.)
>>>
>>> -- Lars
>>>
>>>
>>>
>>> ________________________________
>>> From: Khaled Elmeleegy <kd...@hotmail.com>
>>> To: "user@hbase.apache.org" <user@hbase.apache.org>
>>> Sent: Monday, October 13, 2014 12:13 PM
>>> Subject: RE: HBase read performance
>>>
>>>
>>> I went through a lengthy testing process with Ted. I tried both
>>> disabling just cache-on-write (COW) and disabling both COW and the
>>> bucket cache (BC). Disabling them makes the problem go away. I think
>>> disabling COW effectively stops exercising the block cache in my
>>> workload, since the workload is mostly writes, so the cached blocks
>>> don't churn.
>>>
>>> I tried both overcommitting memory and not. This is irrelevant though.
>>> The cache operates on the virtual memory offered to it and is agnostic
>>> to the physical memory size. Memory overcommitment may only hurt
>>> performance due to swapping by the OS, but this has nothing to do with
>>> the bug.
>>>
>>> Khaled
>>>
>>>> Date: Sun, 12 Oct 2014 23:17:24 -0700
>>>> Subject: Re: HBase read performance
>>>> From: st...@duboce.net
>>>> To: user@hbase.apache.org
>>>>
>>>> On Sun, Oct 12, 2014 at 8:58 PM, Khaled Elmeleegy <kd...@hotmail.com>
>> wrote:
>>>>
>>>>> It goes away, but I think that's because I am basically churning the
>>>>> cache far less. My workload is mostly writes.
>>>>>
>>>>>
>>>> What exactly did you change?
>>>>
>>>> Did you disable both BC and cacheonwrite or just one of these configs?
>>>>
>>>> Disabling BC and/or cacheonwrite would churn your cache more?
>>>>
>>>> What about the heap config? Did you over commit mem? If so, did you
>>>> try doing a direct memory sizing that was w/i the bounds of physical
>>>> memory (well within)? Did you still see exceptions?
>>>>
>>>> St.Ack
>>>>
>>>>
>>>>
>>>>
>>>>>> Date: Sun, 12 Oct 2014 19:43:16 -0700
>>>>>> Subject: Re: HBase read performance
>>>>>> From: yuzhih...@gmail.com
>>>>>> To: user@hbase.apache.org
>>>>>>
>>>>>> Hi,
>>>>>> Can you turn off cacheonwrite and keep bucket cache to see if the
>>>>>> problem goes away?
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> On Fri, Oct 10, 2014 at 10:59 AM, Khaled Elmeleegy <
>> kd...@hotmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Yes, I can reproduce it with some work.
>>>>>>> The workload is basically as follows:
>>>>>>> There are writers streaming writes to a table. Then, there is a
>>>>>>> reader (invoked via a web interface). The reader does 1000 parallel
>>>>>>> reverse scans, which all end up hitting the same region in my case.
>>>>>>> The scans are effectively "gets" as I just need to get one record
>>>>>>> off of each of them. I really need a "reverse" get, which is not
>>>>>>> supported (would be great to have :)), so I do it via a reverse
>>>>>>> scan. After a few tries, the reader consistently hits this bug.
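>>>>>>>
>>>>>>> For concreteness, each "reverse get" looks roughly like this (a
>>>>>>> sketch rather than the actual test program; the table name and the
>>>>>>> helper name are made up):
>>>>>>>
>>>>>>> import java.io.IOException;
>>>>>>> import org.apache.hadoop.hbase.client.HConnection;
>>>>>>> import org.apache.hadoop.hbase.client.HTableInterface;
>>>>>>> import org.apache.hadoop.hbase.client.Result;
>>>>>>> import org.apache.hadoop.hbase.client.ResultScanner;
>>>>>>> import org.apache.hadoop.hbase.client.Scan;
>>>>>>>
>>>>>>> // One "reverse get": a reverse scan, caching 1, first row only.
>>>>>>> Result reverseGet(HConnection connection, byte[] startRow)
>>>>>>>     throws IOException {
>>>>>>>   HTableInterface table = connection.getTable("delta");
>>>>>>>   try {
>>>>>>>     Scan scan = new Scan(startRow);
>>>>>>>     scan.setReversed(true);  // walk backwards from the probe key
>>>>>>>     scan.setCaching(1);      // only the first row is needed
>>>>>>>     ResultScanner scanner = table.getScanner(scan);
>>>>>>>     try {
>>>>>>>       return scanner.next(); // the row at or before the probe key
>>>>>>>     } finally {
>>>>>>>       scanner.close();
>>>>>>>     }
>>>>>>>   } finally {
>>>>>>>     table.close();
>>>>>>>   }
>>>>>>> }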
>>>>>>>
>>>>>>> This happens with these config changes:
>>>>>>> hbase-env: HBASE_REGIONSERVER_OPTS=-Xmx6G -XX:MaxDirectMemorySize=5G
>>>>>>> -XX:CMSInitiatingOccupancyFraction=88 -XX:+AggressiveOpts -verbose:gc
>>>>>>> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
>>>>>>> -Xloggc:/tmp/hbase-regionserver-gc.log
>>>>>>> hbase-site:
>>>>>>> hbase.bucketcache.ioengine=offheap
>>>>>>> hbase.bucketcache.size=4196
>>>>>>> hbase.rs.cacheblocksonwrite=true
>>>>>>> hfile.block.index.cacheonwrite=true
>>>>>>> hfile.block.bloom.cacheonwrite=true
>>>>>>>
>>>>>>> Interestingly, without these config changes, I can't reproduce the
>>>>>>> problem.
>>>>>>>
>>>>>>> Khaled
>>>>>>>
>>>>>>>
>>>>>>>> Date: Fri, 10 Oct 2014 10:05:14 -0700
>>>>>>>> Subject: Re: HBase read performance
>>>>>>>> From: st...@duboce.net
>>>>>>>> To: user@hbase.apache.org
>>>>>>>>
>>>>>>>> It looks like we are messing up our positioning math.
>>>>>>>> From java.nio.Buffer (openjdk 6-b14), lines 233-235
>>>>>>>> (http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/nio/Buffer.java#233):
>>>>>>>>
>>>>>>>>     public final Buffer position(int newPosition) {
>>>>>>>>         if ((newPosition > limit) || (newPosition < 0))
>>>>>>>>             throw new IllegalArgumentException();
>>>>>>>>
>>>>>>>>
>>>>>>>> Is it easy to reproduce, Khaled? Always the same region/store, or
>>>>>>>> spread across all reads?
>>>>>>>>
>>>>>>>>
>>>>>>>> St.Ack
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Oct 10, 2014 at 8:31 AM, Khaled Elmeleegy <
>> kd...@hotmail.com
>>>>>>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Andrew, thanks. I think this indeed was the problem.
>>>>>>>>> To get over it, I increased the amount of memory given to the
>>>>>>>>> region server to avoid IO on reads. I used the configs below. In
>>>>>>>>> my experiments, I have writers streaming their output to HBase.
>>>>>>>>> The reader powers a web page and does this scatter/gather, where
>>>>>>>>> it reads the last 1000 keys written and passes them to the front
>>>>>>>>> end. With this workload, I get the exception below at the region
>>>>>>>>> server. Again, I am using HBase (0.98.6.1). Any help is
>>>>>>>>> appreciated.
>>>>>>>>> hbase-env: HBASE_REGIONSERVER_OPTS=-Xmx6G -XX:MaxDirectMemorySize=5G
>>>>>>>>> -XX:CMSInitiatingOccupancyFraction=88 -XX:+AggressiveOpts -verbose:gc
>>>>>>>>> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
>>>>>>>>> -Xloggc:/tmp/hbase-regionserver-gc.log
>>>>>>>>> hbase-site:
>>>>>>>>> hbase.bucketcache.ioengine=offheap
>>>>>>>>> hfile.block.cache.size=0.4
>>>>>>>>> hbase.bucketcache.size=4196
>>>>>>>>> hbase.rs.cacheblocksonwrite=true
>>>>>>>>> hfile.block.index.cacheonwrite=true
>>>>>>>>> hfile.block.bloom.cacheonwrite=true
>>>>>>>>>
>>>>>>>>> 2014-10-10 15:06:44,173 ERROR
>>>>>>>>> [B.DefaultRpcServer.handler=62,queue=2,port=60020] ipc.RpcServer:
>>>>>>>>> Unexpected throwable object
>>>>>>>>> java.lang.IllegalArgumentException
>>>>>>>>>   at java.nio.Buffer.position(Buffer.java:236)
>>>>>>>>>   at org.apache.hadoop.hbase.util.ByteBufferUtils.skip(ByteBufferUtils.java:434)
>>>>>>>>>   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:849)
>>>>>>>>>   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:760)
>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:248)
>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:152)
>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:317)
>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:176)
>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:1780)
>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:3758)
>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1950)
>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1936)
>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1913)
>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3157)
>>>>>>>>>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
>>>>>>>>>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
>>>>>>>>>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>>>>>>>>>   at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>>>>>>>>>   at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>>>>>>>>>   at java.lang.Thread.run(Thread.java:744)
>>>>>>>>>> From: apurt...@apache.org
>>>>>>>>>> Date: Tue, 7 Oct 2014 17:09:35 -0700
>>>>>>>>>> Subject: Re: HBase read performance
>>>>>>>>>> To: user@hbase.apache.org
>>>>>>>>>>
>>>>>>>>>>> The cluster has 2 m1.large nodes.
>>>>>>>>>>
>>>>>>>>>> That's the problem right there.
>>>>>>>>>>
>>>>>>>>>> You need to look at c3.4xlarge or i2 instances as a minimum
>>>>>>>>>> requirement. M1 and even M3 instance types have ridiculously
>>>>>>>>>> poor IO.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Oct 7, 2014 at 3:01 PM, Khaled Elmeleegy <
>>>>> kd...@hotmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks Nicolas, Qiang.
>>>>>>>>>>>
>>>>>>>>>>> I was able to write a simple program that reproduces the
>>>>>>>>>>> problem on a tiny HBase cluster on ec2. The cluster has 2
>>>>>>>>>>> m1.large nodes. One node runs the master, name node and
>>>>>>>>>>> zookeeper. The other node runs a data node and a region server,
>>>>>>>>>>> with the heap size configured to be 6GB. There, the 1000
>>>>>>>>>>> parallel reverse gets (reverse scans) take 7-8 seconds. The
>>>>>>>>>>> data set is tiny (10M records, each having a small number of
>>>>>>>>>>> bytes). As I said before, all hardware resources are very idle
>>>>>>>>>>> there.
>>>>>>>>>>>
>>>>>>>>>>> Interestingly, running the same workload on my macbook, the
>>>>>>>>>>> 1000 parallel gets take ~200ms on a pseudo-distributed
>>>>>>>>>>> installation.
>>>>>>>>>>>
>>>>>>>>>>> Any help to resolve this mystery is highly appreciated.
>>>>>>>>>>>
>>>>>>>>>>> P.S. please find my test program attached.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Khaled
>>>>>>>>>>>
>>>>>>>>>>>> From: nkey...@gmail.com
>>>>>>>>>>>> Date: Mon, 6 Oct 2014 09:40:48 +0200
>>>>>>>>>>>> Subject: Re: HBase read performance
>>>>>>>>>>>> To: user@hbase.apache.org
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I haven't seen it mentioned, but if I understand correctly,
>>>>>>>>>>>> each scan returns a single row? If so, you should use
>>>>>>>>>>>> Scan#setSmall to save some rpc calls.
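>>>>>>>>>>>>
>>>>>>>>>>>> A rough sketch of what I mean (startRow stands for your probe
>>>>>>>>>>>> key; worth double-checking that setSmall combines with a
>>>>>>>>>>>> reversed scan on 0.98):
>>>>>>>>>>>>
>>>>>>>>>>>> Scan scan = new Scan(startRow);
>>>>>>>>>>>> scan.setReversed(true);  // as in your workload
>>>>>>>>>>>> scan.setCaching(1);
>>>>>>>>>>>> scan.setSmall(true);     // skips the extra open/close RPC round trips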
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>
>>>>>>>>>>>> Nicolas
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Oct 5, 2014 at 11:28 AM, Qiang Tian <
>>>>> tian...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> When using a separate HConnection instance, both its
>>>>>>>>>>>>> RpcClient instance (maintains the connection to a
>>>>>>>>>>>>> regionserver) and its Registry instance (maintains the
>>>>>>>>>>>>> connection to zookeeper) will be separate.
>>>>>>>>>>>>>
>>>>>>>>>>>>> See
>>>>>>>>>>>>> http://shammijayasinghe.blogspot.com/2012/02/zookeeper-increase-maximum-number-of.html
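>>>>>>>>>>>>>
>>>>>>>>>>>>> (The setting behind that link is zookeeper's per-client
>>>>>>>>>>>>> connection limit. Roughly, and only as an example value, that
>>>>>>>>>>>>> would be maxClientCnxns=300 in zoo.cfg, or
>>>>>>>>>>>>> hbase.zookeeper.property.maxClientCnxns=300 in hbase-site if
>>>>>>>>>>>>> HBase manages the zookeeper quorum.)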
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Oct 5, 2014 at 2:24 PM, Khaled Elmeleegy <
>>>>>>>>> kd...@hotmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I tried creating my own pool of HConnections to use for my
>>>>>>>>>>>>>> HBase calls, so that not all the (2K) threads share the same
>>>>>>>>>>>>>> HConnection. However, I could only have 10 HConnections.
>>>>>>>>>>>>>> Beyond that I get ZK exceptions; please find one below.
>>>>>>>>>>>>>> Also, with 10 HConnections, I don't see a noticeable
>>>>>>>>>>>>>> improvement in performance so far.
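>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Roughly what the pool looks like (a simplified sketch, not
>>>>>>>>>>>>>> the actual code):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>> import org.apache.hadoop.hbase.HBaseConfiguration;
>>>>>>>>>>>>>> import org.apache.hadoop.hbase.client.HConnection;
>>>>>>>>>>>>>> import org.apache.hadoop.hbase.client.HConnectionManager;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> // A fixed-size pool of independent connections; each one has
>>>>>>>>>>>>>> // its own RpcClient and its own zookeeper Registry.
>>>>>>>>>>>>>> class ConnectionPool {
>>>>>>>>>>>>>>   private final HConnection[] connections;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   ConnectionPool(int size) throws IOException {
>>>>>>>>>>>>>>     Configuration conf = HBaseConfiguration.create();
>>>>>>>>>>>>>>     connections = new HConnection[size];
>>>>>>>>>>>>>>     for (int i = 0; i < size; i++) {
>>>>>>>>>>>>>>       connections[i] = HConnectionManager.createConnection(conf);
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   HConnection pick(int threadId) {
>>>>>>>>>>>>>>     return connections[threadId % connections.length];
>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>> }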
>>>>>>>>>>>>>> 2014-10-05 06:11:26,490 WARN [main] zookeeper.RecoverableZooKeeper (RecoverableZooKeeper.java:retryOrThrow(253)) - Possibly transient ZooKeeper, quorum=54.68.206.252:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
>>>>>>>>>>>>>> 2014-10-05 06:11:26,490 INFO [main] util.RetryCounter (RetryCounter.java:sleepUntilNextRetry(155)) - Sleeping 1000ms before retry #0...
>>>>>>>>>>>>>> 2014-10-05 06:11:27,845 WARN [main] zookeeper.RecoverableZooKeeper (RecoverableZooKeeper.java:retryOrThrow(253)) - Possibly transient ZooKeeper, quorum=54.68.206.252:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
>>>>>>>>>>>>>> 2014-10-05 06:11:27,849 INFO [main] util.RetryCounter (RetryCounter.java:sleepUntilNextRetry(155)) - Sleeping 2000ms before retry #1...
>>>>>>>>>>>>>> 2014-10-05 06:11:30,405 WARN [main] zookeeper.RecoverableZooKeeper (RecoverableZooKeeper.java:retryOrThrow(253)) - Possibly transient ZooKeeper, quorum=54.68.206.252:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
>>>>>>>>>>>>>> 2014-10-05 06:11:30,405 INFO [main] util.RetryCounter (RetryCounter.java:sleepUntilNextRetry(155)) - Sleeping 4000ms before retry #2...
>>>>>>>>>>>>>> 2014-10-05 06:11:35,278 WARN [main] zookeeper.RecoverableZooKeeper (RecoverableZooKeeper.java:retryOrThrow(253)) - Possibly transient ZooKeeper, quorum=54.68.206.252:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
>>>>>>>>>>>>>> 2014-10-05 06:11:35,279 INFO [main] util.RetryCounter (RetryCounter.java:sleepUntilNextRetry(155)) - Sleeping 8000ms before retry #3...
>>>>>>>>>>>>>> 2014-10-05 06:11:44,393 WARN [main] zookeeper.RecoverableZooKeeper (RecoverableZooKeeper.java:retryOrThrow(253)) - Possibly transient ZooKeeper, quorum=54.68.206.252:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
>>>>>>>>>>>>>> 2014-10-05 06:11:44,393 ERROR [main] zookeeper.RecoverableZooKeeper (RecoverableZooKeeper.java:retryOrThrow(255)) - ZooKeeper exists failed after 4 attempts
>>>>>>>>>>>>>> 2014-10-05 06:11:44,394 WARN [main] zookeeper.ZKUtil (ZKUtil.java:checkExists(482)) - hconnection-0x4e174f3b, quorum=54.68.206.252:2181, baseZNode=/hbase Unable to set watcher on znode (/hbase/hbaseid)
>>>>>>>>>>>>>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
>>>>>>>>>>>>>>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>>>>>>>>>>>>>>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>>>>>>>>>>>>>>   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:199)
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479)
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:857)
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:662)
>>>>>>>>>>>>>>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>>>>>>>>>>>>>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>>>>>>>>>>>>>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>>>>>>>>>>>>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:414)
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:335)...
>>>>>>>>>>>>>>> From: kd...@hotmail.com
>>>>>>>>>>>>>>> To: user@hbase.apache.org
>>>>>>>>>>>>>>> Subject: RE: HBase read performance
>>>>>>>>>>>>>>> Date: Fri, 3 Oct 2014 12:37:39 -0700
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Lars, Ted, and Qiang,
>>>>>>>>>>>>>>> Thanks for all the input.
>>>>>>>>>>>>>>> Qiang: yes, all the threads are in the same client process
>>>>>>>>>>>>>>> sharing the same connection. And since I don't see hardware
>>>>>>>>>>>>>>> contention, maybe there is contention over this code path.
>>>>>>>>>>>>>>> I'll try using many connections, see if it alleviates the
>>>>>>>>>>>>>>> problem, and report back.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>>> Khaled
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Date: Fri, 3 Oct 2014 15:18:30 +0800
>>>>>>>>>>>>>>>> Subject: Re: HBase read performance
>>>>>>>>>>>>>>>> From: tian...@gmail.com
>>>>>>>>>>>>>>>> To: user@hbase.apache.org
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regarding profiling, Andrew introduced
>>>>>>>>>>>>>>>> http://www.brendangregg.com/blog/2014-06-12/java-flame-graphs.html
>>>>>>>>>>>>>>>> months ago.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> processCallTime comes from RpcServer#call, so it looks
>>>>>>>>>>>>>>>> good?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have a suspect:
>>>>>>>>>>> https://issues.apache.org/jira/browse/HBASE-11306
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> How many processes do you have for your 2000 threads?
>>>>>>>>>>>>>>>> If only 1 process, those threads will share just 1
>>>>>>>>>>>>>>>> connection to that regionserver, and there might be big
>>>>>>>>>>>>>>>> contention on the RPC code path. For such a case, could
>>>>>>>>>>>>>>>> you try using different connections?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HConnectionManager.html
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Oct 3, 2014 at 9:55 AM, Ted Yu <
>>>>>>> yuzhih...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Khaled:
>>>>>>>>>>>>>>>>> Do you have profiler such as jprofiler ?
>>>>>>>>>>>>>>>>> Profiling would give us more hint.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Otherwise, capturing a stack trace during the period of
>>>>>>>>>>>>>>>>> the reverse scan would help.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Oct 2, 2014 at 4:52 PM, lars
>> hofhansl <
>>>>>>>>>>> la...@apache.org>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> You might have the data in the OS buffer cache. Without
>>>>>>>>>>>>>>>>>> short circuit reading, the region server has to request
>>>>>>>>>>>>>>>>>> the block from the data node process, which then reads
>>>>>>>>>>>>>>>>>> it from the block cache.
>>>>>>>>>>>>>>>>>> That is a few context switches per RPC that do not show
>>>>>>>>>>>>>>>>>> up in CPU metrics. In that case you also would not see
>>>>>>>>>>>>>>>>>> disk IO.
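>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> (If short circuit reads are not enabled yet, the usual
>>>>>>>>>>>>>>>>>> knobs to check, depending on the HDFS version, look
>>>>>>>>>>>>>>>>>> roughly like this in hdfs-site.xml on the datanodes and
>>>>>>>>>>>>>>>>>> region server hosts:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> dfs.client.read.shortcircuit=true
>>>>>>>>>>>>>>>>>> dfs.domain.socket.path=/var/lib/hadoop-hdfs/dn_socket
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The socket path above is only an example location.)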
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If - as you say - you see a lot of evicted blocks, the
>>>>>>>>>>>>>>>>>> data *has* to come from the OS. If you do not see disk
>>>>>>>>>>>>>>>>>> IO, it *has* to come from the OS cache.
>>>>>>>>>>>>>>>>>> I.e. there's more RAM on your boxes, and you should
>>>>>>>>>>>>>>>>>> increase the heap block cache.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> You can measure the context switches with vmstat. Other
>>>>>>>>>>>>>>>>>> than that I have no suggestion until I reproduce the
>>>>>>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>>>>>> Also check the data locality index of the region server;
>>>>>>>>>>>>>>>>>> it should be close to 100%.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -- Lars
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ________________________________
>>>>>>>>>>>>>>>>>> From: Khaled Elmeleegy <kd...@hotmail.com>
>>>>>>>>>>>>>>>>>> To: "user@hbase.apache.org" <
>>>>> user@hbase.apache.org>
>>>>>>>>>>>>>>>>>> Sent: Thursday, October 2, 2014 3:24 PM
>>>>>>>>>>>>>>>>>> Subject: RE: HBase read performance
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Lars, thanks a lot for all the tips. I'll make sure I
>>>>>>>>>>>>>>>>>> cover all of them and get back to you. I am not sure
>>>>>>>>>>>>>>>>>> they are the bottleneck though, as they are all about
>>>>>>>>>>>>>>>>>> optimizing physical resource usage. As I said, I don't
>>>>>>>>>>>>>>>>>> see any contended physical resources now. I'll also try
>>>>>>>>>>>>>>>>>> to reproduce this problem in a simpler environment and
>>>>>>>>>>>>>>>>>> pass you the test program to play with.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> A couple of high level points to make. You are right
>>>>>>>>>>>>>>>>>> that my use case is kind of a worst case for HBase
>>>>>>>>>>>>>>>>>> reads. But, if things go the way you described them,
>>>>>>>>>>>>>>>>>> there should be tons of disk IO and that should clearly
>>>>>>>>>>>>>>>>>> be the bottleneck. This is not the case though, for the
>>>>>>>>>>>>>>>>>> simple reason that this is done in a test environment (I
>>>>>>>>>>>>>>>>>> am still prototyping), and not a lot of data has been
>>>>>>>>>>>>>>>>>> written to HBase yet. However, for the real use case,
>>>>>>>>>>>>>>>>>> there should be writers constantly writing data to HBase
>>>>>>>>>>>>>>>>>> and readers occasionally doing this scatter/gather. At
>>>>>>>>>>>>>>>>>> steady state, things should only get worse and all the
>>>>>>>>>>>>>>>>>> issues you mentioned should get far more pronounced. At
>>>>>>>>>>>>>>>>>> that point, one can try to mitigate it using more memory
>>>>>>>>>>>>>>>>>> or so. I am not there yet, as I think I am hitting some
>>>>>>>>>>>>>>>>>> software bottleneck which I don't know how to work
>>>>>>>>>>>>>>>>>> around.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Khaled
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>>>>>> Date: Thu, 2 Oct 2014 14:20:47 -0700
>>>>>>>>>>>>>>>>>>> From: la...@apache.org
>>>>>>>>>>>>>>>>>>> Subject: Re: HBase read performance
>>>>>>>>>>>>>>>>>>> To: user@hbase.apache.org
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> OK... We might need to investigate this.
>>>>>>>>>>>>>>>>>>> Any chance that you can provide a minimal test program
>>>>>>>>>>>>>>>>>>> and instructions on how to set it up?
>>>>>>>>>>>>>>>>>>> We can do some profiling then.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> One thing to note is that with scanning, HBase cannot
>>>>>>>>>>>>>>>>>>> use bloom filters to rule out HFiles ahead of time; it
>>>>>>>>>>>>>>>>>>> needs to look into all of them.
>>>>>>>>>>>>>>>>>>> So you kind of hit on the absolute worst
>> case:
>>>>>>>>>>>>>>>>>>> - random reads that do not fit into the
>> block
>>>>> cache
>>>>>>>>>>>>>>>>>>> - cannot use bloom filters
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> A few more questions/comments:
>>>>>>>>>>>>>>>>>>> - Do you have short circuit reading enabled? If not,
>>>>>>>>>>>>>>>>>>> you should.
>>>>>>>>>>>>>>>>>>> - Is your table major compacted? That will reduce the
>>>>>>>>>>>>>>>>>>> number of files to look at.
>>>>>>>>>>>>>>>>>>> - Did you disable Nagle's everywhere (enabled
>>>>>>>>>>>>>>>>>>> tcpnodelay)? It is disabled by default in HBase, but
>>>>>>>>>>>>>>>>>>> not necessarily in your install of HDFS.
>>>>>>>>>>>>>>>>>>> - Which version of HDFS are you using as the backing
>>>>>>>>>>>>>>>>>>> filesystem?
>>>>>>>>>>>>>>>>>>> - If your disk is idle, it means the data fits into
>>>>>>>>>>>>>>>>>>> the OS buffer cache. In turn that means that you can
>>>>>>>>>>>>>>>>>>> increase the heap for the region servers. You can also
>>>>>>>>>>>>>>>>>>> use block encoding (FAST_DIFF) to try to make sure the
>>>>>>>>>>>>>>>>>>> entire working set fits into the cache (see the sketch
>>>>>>>>>>>>>>>>>>> after this list).
>>>>>>>>>>>>>>>>>>> - Also try to reduce the block size - although if your
>>>>>>>>>>>>>>>>>>> overall working set does not fit in the heap it won't
>>>>>>>>>>>>>>>>>>> help much.
>>>>>>>>>>>>>>>>>>>
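>>>>>>>>>>>>>>>>>>> A sketch of the column family changes I mean
>>>>>>>>>>>>>>>>>>> (table/family names are made up; the same can be done
>>>>>>>>>>>>>>>>>>> with alter from the hbase shell):
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.hbase.HBaseConfiguration;
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.hbase.HColumnDescriptor;
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.hbase.client.HBaseAdmin;
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.hbase.io.encoding.DataBlockEncoding;
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> // Hypothetical tuning of family "d" in table "delta".
>>>>>>>>>>>>>>>>>>> void tuneFamily() throws Exception {
>>>>>>>>>>>>>>>>>>>   Configuration conf = HBaseConfiguration.create();
>>>>>>>>>>>>>>>>>>>   HBaseAdmin admin = new HBaseAdmin(conf);
>>>>>>>>>>>>>>>>>>>   HColumnDescriptor cf = new HColumnDescriptor("d");
>>>>>>>>>>>>>>>>>>>   cf.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF); // denser blocks
>>>>>>>>>>>>>>>>>>>   cf.setBlocksize(8 * 1024); // smaller HFile blocks than the 64k default
>>>>>>>>>>>>>>>>>>>   admin.modifyColumn("delta", cf);
>>>>>>>>>>>>>>>>>>>   admin.close();
>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>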
>>>>>>>>>>>>>>>>>>> This is a good section of the book to read through
>>>>>>>>>>>>>>>>>>> generally (even though you might know most of this
>>>>>>>>>>>>>>>>>>> already):
>>>>>>>>>>>>>>>>>>> http://hbase.apache.org/book.html#perf.configurations
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> -- Lars
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>>>>>>> From: Khaled Elmeleegy <
>> kd...@hotmail.com>
>>>>>>>>>>>>>>>>>>> To: "user@hbase.apache.org" <
>>>>> user@hbase.apache.org
>>>>>>>>
>>>>>>>>>>>>>>>>>>> Cc:
>>>>>>>>>>>>>>>>>>> Sent: Thursday, October 2, 2014 11:27 AM
>>>>>>>>>>>>>>>>>>> Subject: RE: HBase read performance
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I do see a very brief spike in CPU (user/system), but
>>>>>>>>>>>>>>>>>>> it's nowhere near 0% idle. It goes from 99% idle down
>>>>>>>>>>>>>>>>>>> to something like 40% idle for a second or so. The
>>>>>>>>>>>>>>>>>>> thing to note is that this is all on a test cluster,
>>>>>>>>>>>>>>>>>>> so there is no real load. Things are generally idle
>>>>>>>>>>>>>>>>>>> until I issue 2-3 of these multi-scan requests to
>>>>>>>>>>>>>>>>>>> render a web page. Then, you see the spike in the cpu
>>>>>>>>>>>>>>>>>>> and some activity in the network and disk, but nowhere
>>>>>>>>>>>>>>>>>>> near saturation.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> If there are specific tests you'd like
>> me to
>>>>> do to
>>>>>>>>> debug
>>>>>>>>>>> this,
>>>>>>>>>>>>>> I'd be
>>>>>>>>>>>>>>>>>> more than happy to do it.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Khaled
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>>>>>>> Date: Thu, 2 Oct 2014 11:15:59 -0700
>>>>>>>>>>>>>>>>>>>> From: la...@apache.org
>>>>>>>>>>>>>>>>>>>> Subject: Re: HBase read performance
>>>>>>>>>>>>>>>>>>>> To: user@hbase.apache.org
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I still think you're waiting on disk. No
>>>>> IOWAIT?
>>>>>>> So
>>>>>>>>> CPU
>>>>>>>>>>> is not
>>>>>>>>>>>>>> waiting
>>>>>>>>>>>>>>>>>> a lot for IO. No high User/System CPU
>> either?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> If you see a lot of evicted blocks then each RPC has
>>>>>>>>>>>>>>>>>>>> a high chance of requiring to bring an entire 64k
>>>>>>>>>>>>>>>>>>>> block in. You'll see bad performance with this.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> We might need to trace this specific
>> scenario.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> -- Lars
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ________________________________
>>>>>>>>>>>>>>>>>>>> From: Khaled Elmeleegy <
>> kd...@hotmail.com>
>>>>>>>>>>>>>>>>>>>> To: "user@hbase.apache.org" <
>>>>>>> user@hbase.apache.org>
>>>>>>>>>>>>>>>>>>>> Sent: Thursday, October 2, 2014 10:46 AM
>>>>>>>>>>>>>>>>>>>> Subject: RE: HBase read performance
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I've set the heap size to 6GB and I do
>> have gc
>>>>>>>>> logging. No
>>>>>>>>>>>>> long
>>>>>>>>>>>>>> pauses
>>>>>>>>>>>>>>>>>> there -- occasional 0.1s or 0.2s.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Other than the discrepancy between what's reported on
>>>>>>>>>>>>>>>>>>>> the client and what's reported at the RS, there is
>>>>>>>>>>>>>>>>>>>> also the issue of not getting proper concurrency. So,
>>>>>>>>>>>>>>>>>>>> even if a reverse get takes 100ms or so (this has to
>>>>>>>>>>>>>>>>>>>> be mostly blocking on various things, as no physical
>>>>>>>>>>>>>>>>>>>> resource is contended), the other gets/scans should be
>>>>>>>>>>>>>>>>>>>> able to proceed in parallel, so a thousand concurrent
>>>>>>>>>>>>>>>>>>>> gets/scans should finish in a few hundred ms, not many
>>>>>>>>>>>>>>>>>>>> seconds. That's why I thought I'd increase the handler
>>>>>>>>>>>>>>>>>>>> count to try to get more concurrency, but it didn't
>>>>>>>>>>>>>>>>>>>> help. So, there must be something else.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Khaled
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>>>>>>>> From: ndimi...@gmail.com
>>>>>>>>>>>>>>>>>>>>> Date: Thu, 2 Oct 2014 10:36:39 -0700
>>>>>>>>>>>>>>>>>>>>> Subject: Re: HBase read performance
>>>>>>>>>>>>>>>>>>>>> To: user@hbase.apache.org
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Do check again on the heap size of the
>> region
>>>>>>>>> servers.
>>>>>>>>>>> The
>>>>>>>>>>>>>> default
>>>>>>>>>>>>>>>>>>>>> unconfigured size is 1G; too small for
>> much
>>>>> of
>>>>>>>>> anything.
>>>>>>>>>>>>> Check
>>>>>>>>>>>>>> your
>>>>>>>>>>>>>>>>> RS
>>>>>>>>>>>>>>>>>> logs
>>>>>>>>>>>>>>>>>>>>> -- look for lines produced by the
>>>>> JVMPauseMonitor
>>>>>>>>> thread.
>>>>>>>>>>>>> They
>>>>>>>>>>>>>>>>> usually
>>>>>>>>>>>>>>>>>>>>> correlate with long GC pauses or other
>>>>>>> process-freeze
>>>>>>>>>>> events.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Get is implemented as a Scan of a
>> single
>>>>> row, so
>>>>>>> a
>>>>>>>>>>> reverse
>>>>>>>>>>>>>> scan of a
>>>>>>>>>>>>>>>>>> single
>>>>>>>>>>>>>>>>>>>>> row should be functionally equivalent.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> In practice, I have seen discrepancy
>> between
>>>>> the
>>>>>>>>>>> latencies
>>>>>>>>>>>>>> reported
>>>>>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> RS and the latencies experienced by the
>>>>> client.
>>>>>>> I've
>>>>>>>>> not
>>>>>>>>>>>>>> investigated
>>>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>>> area thoroughly.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Thu, Oct 2, 2014 at 10:05 AM, Khaled
>>>>>>> Elmeleegy <
>>>>>>>>>>>>>> kd...@hotmail.com
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks Lars for your quick reply.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Yes performance is similar with less
>>>>> handlers (I
>>>>>>>>> tried
>>>>>>>>>>> with
>>>>>>>>>>>>>> 100
>>>>>>>>>>>>>>>>>> first).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> The payload is not big ~1KB or so. The
>>>>> working
>>>>>>> set
>>>>>>>>>>> doesn't
>>>>>>>>>>>>>> seem to
>>>>>>>>>>>>>>>>>> fit in
>>>>>>>>>>>>>>>>>>>>>> memory as there are many cache misses.
>>>>> However,
>>>>>>>>> disk is
>>>>>>>>>>> far
>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>>>> being a
>>>>>>>>>>>>>>>>>>>>>> bottleneck. I checked using iostat. I
>> also
>>>>>>> verified
>>>>>>>>> that
>>>>>>>>>>>>>> neither the
>>>>>>>>>>>>>>>>>>>>>> network nor the CPU of the region
>> server or
>>>>> the
>>>>>>>>> client
>>>>>>>>>>> are a
>>>>>>>>>>>>>>>>>> bottleneck.
>>>>>>>>>>>>>>>>>>>>>> This leads me to believe that likely
>> this
>>>>> is a
>>>>>>>>> software
>>>>>>>>>>>>>> bottleneck,
>>>>>>>>>>>>>>>>>>>>>> possibly due to a misconfiguration on
>> my
>>>>> side. I
>>>>>>>>> just
>>>>>>>>>>> don't
>>>>>>>>>>>>>> know how
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>> debug it. A clear disconnect I see is
>> the
>>>>>>> individual
>>>>>>>>>>> request
>>>>>>>>>>>>>> latency
>>>>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>>> reported by metrics on the region
>> server
>>>>> (IPC
>>>>>>>>>>>>> processCallTime
>>>>>>>>>>>>>> vs
>>>>>>>>>>>>>>>>>> scanNext)
>>>>>>>>>>>>>>>>>>>>>> vs what's measured on the client.
>> Does this
>>>>>>> sound
>>>>>>>>>>> right? Any
>>>>>>>>>>>>>> ideas
>>>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>> how
>>>>>>>>>>>>>>>>>>>>>> to better debug it?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> About this trick with the timestamps
>> to be
>>>>> able
>>>>>>> to
>>>>>>>>> do a
>>>>>>>>>>>>>> forward
>>>>>>>>>>>>>>>>> scan,
>>>>>>>>>>>>>>>>>>>>>> thanks for pointing it out. Actually,
>> I am
>>>>>>> aware of
>>>>>>>>> it.
>>>>>>>>>>> The
>>>>>>>>>>>>>> problem
>>>>>>>>>>>>>>>>> I
>>>>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>>>>> is, sometimes I want to get the key
>> after a
>>>>>>>>> particular
>>>>>>>>>>>>>> timestamp and
>>>>>>>>>>>>>>>>>>>>>> sometimes I want to get the key
>> before, so
>>>>> just
>>>>>>>>> relying
>>>>>>>>>>> on
>>>>>>>>>>>>>> the key
>>>>>>>>>>>>>>>>>> order
>>>>>>>>>>>>>>>>>>>>>> doesn't work. Ideally, I want a
>> reverse
>>>>> get(). I
>>>>>>>>> thought
>>>>>>>>>>>>>> reverse
>>>>>>>>>>>>>>>>> scan
>>>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>>>>>> do the trick though.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Khaled
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>> ----------------------------------------
>>>>>>>>>>>>>>>>>>>>>>> Date: Thu, 2 Oct 2014 09:40:37 -0700
>>>>>>>>>>>>>>>>>>>>>>> From: la...@apache.org
>>>>>>>>>>>>>>>>>>>>>>> Subject: Re: HBase read performance
>>>>>>>>>>>>>>>>>>>>>>> To: user@hbase.apache.org
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi Khaled,
>>>>>>>>>>>>>>>>>>>>>>> is it the same with fewer threads?
>> 1500
>>>>> handler
>>>>>>>>> threads
>>>>>>>>>>>>>> seems to
>>>>>>>>>>>>>>>>> be a
>>>>>>>>>>>>>>>>>>>>>> lot. Typically a good number of
>> threads
>>>>> depends
>>>>>>> on
>>>>>>>>> the
>>>>>>>>>>>>>> hardware
>>>>>>>>>>>>>>>>>> (number of
>>>>>>>>>>>>>>>>>>>>>> cores, number of spindles, etc). I
>> cannot
>>>>> think
>>>>>>> of
>>>>>>>>> any
>>>>>>>>>>> type
>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>> scenario
>>>>>>>>>>>>>>>>>>>>>> where more than 100 would give any
>>>>> improvement.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> How large is the payload per KV
>> retrieved
>>>>> that
>>>>>>>>> way? If
>>>>>>>>>>>>> large
>>>>>>>>>>>>>> (as
>>>>>>>>>>>>>>>>> in a
>>>>>>>>>>>>>>>>>>>>>> few 100k) you definitely want to
>> lower the
>>>>>>> number
>>>>>>>>> of the
>>>>>>>>>>>>>> handler
>>>>>>>>>>>>>>>>>> threads.
>>>>>>>>>>>>>>>>>>>>>>> How much heap do you give the region
>>>>> server?
>>>>>>> Does
>>>>>>>>> the
>>>>>>>>>>>>>> working set
>>>>>>>>>>>>>>>>> fit
>>>>>>>>>>>>>>>>>>>>>> into the cache? (i.e. in the metrics,
>> do you
>>>>>>> see the
>>>>>>>>>>>>> eviction
>>>>>>>>>>>>>> count
>>>>>>>>>>>>>>>>>> going
>>>>>>>>>>>>>>>>>>>>>> up, if so it does not fit into the
>> cache).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> If the working set does not fit into
>> the
>>>>> cache
>>>>>>>>>>> (eviction
>>>>>>>>>>>>>> count goes
>>>>>>>>>>>>>>>>>> up)
>>>>>>>>>>>>>>>>>>>>>> then HBase will need to bring a new
>> block in
>>>>>>> from
>>>>>>>>> disk
>>>>>>>>>>> on
>>>>>>>>>>>>>> each Get
>>>>>>>>>>>>>>>>>>>>>> (assuming the Gets are more or less
>> random
>>>>> as
>>>>>>> far
>>>>>>>>> as the
>>>>>>>>>>>>>> server is
>>>>>>>>>>>>>>>>>>>>>> concerned).
>>>>>>>>>>>>>>>>>>>>>>> In that case you'll benefit from reducing the
>>>>>>>>>>>>>>>>>>>>>>> HFile block size (from 64k to 8k or even 4k).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Lastly, I don't think we tested the performance of
>>>>>>>>>>>>>>>>>>>>>>> using reverse scan this way; there is probably room
>>>>>>>>>>>>>>>>>>>>>>> to optimize this.
>>>>>>>>>>>>>>>>>>>>>>> Can you restructure your keys to allow forward
>>>>>>>>>>>>>>>>>>>>>>> scanning? For example you could store the time as
>>>>>>>>>>>>>>>>>>>>>>> MAX_LONG-time. Or you could invert all the bits of
>>>>>>>>>>>>>>>>>>>>>>> the time portion of the key, so that it sorts the
>>>>>>>>>>>>>>>>>>>>>>> other way. Then you could do a forward scan.
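>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> A small sketch of such a key layout (the field
>>>>>>>>>>>>>>>>>>>>>>> names are made up):
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.hbase.util.Bytes;
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> // Newest entry sorts first, so a forward scan finds it.
>>>>>>>>>>>>>>>>>>>>>>> byte[] makeRowKey(byte[] entityId, long timestampMillis) {
>>>>>>>>>>>>>>>>>>>>>>>   long reversedTs = Long.MAX_VALUE - timestampMillis;
>>>>>>>>>>>>>>>>>>>>>>>   return Bytes.add(entityId, Bytes.toBytes(reversedTs));
>>>>>>>>>>>>>>>>>>>>>>> }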
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Let us know how it goes.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> -- Lars
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>>>>>>>>>>> From: Khaled Elmeleegy <
>> kd...@hotmail.com>
>>>>>>>>>>>>>>>>>>>>>>> To: "user@hbase.apache.org" <
>>>>>>> user@hbase.apache.org
>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Cc:
>>>>>>>>>>>>>>>>>>>>>>> Sent: Thursday, October 2, 2014
>> 12:12 AM
>>>>>>>>>>>>>>>>>>>>>>> Subject: HBase read performance
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I am trying to do a scatter/gather on hbase
>>>>>>>>>>>>>>>>>>>>>>> (0.98.6.1), where I have a client reading ~1000
>>>>>>>>>>>>>>>>>>>>>>> keys from an HBase table. These keys happen to fall
>>>>>>>>>>>>>>>>>>>>>>> on the same region server. For my reads I use a
>>>>>>>>>>>>>>>>>>>>>>> reverse scan to read each key, as I want the key
>>>>>>>>>>>>>>>>>>>>>>> prior to a specific time stamp (time stamps are
>>>>>>>>>>>>>>>>>>>>>>> stored in reverse order). I don't believe gets can
>>>>>>>>>>>>>>>>>>>>>>> accomplish that, right? So I use scan, with caching
>>>>>>>>>>>>>>>>>>>>>>> set to 1.
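>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Roughly, the scatter/gather looks like this (a
>>>>>>>>>>>>>>>>>>>>>>> simplified sketch, not the actual code; reverseGet
>>>>>>>>>>>>>>>>>>>>>>> stands for the per-key reverse scan just described):
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> import java.util.ArrayList;
>>>>>>>>>>>>>>>>>>>>>>> import java.util.List;
>>>>>>>>>>>>>>>>>>>>>>> import java.util.concurrent.Callable;
>>>>>>>>>>>>>>>>>>>>>>> import java.util.concurrent.ExecutorService;
>>>>>>>>>>>>>>>>>>>>>>> import java.util.concurrent.Executors;
>>>>>>>>>>>>>>>>>>>>>>> import java.util.concurrent.Future;
>>>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.hbase.client.HConnection;
>>>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.hbase.client.Result;
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> // Issue all the per-key reverse scans in parallel, then gather.
>>>>>>>>>>>>>>>>>>>>>>> List<Result> scatterGather(final HConnection connection,
>>>>>>>>>>>>>>>>>>>>>>>     List<byte[]> keys) throws Exception {
>>>>>>>>>>>>>>>>>>>>>>>   ExecutorService pool = Executors.newFixedThreadPool(keys.size());
>>>>>>>>>>>>>>>>>>>>>>>   List<Future<Result>> futures = new ArrayList<Future<Result>>();
>>>>>>>>>>>>>>>>>>>>>>>   for (final byte[] key : keys) {
>>>>>>>>>>>>>>>>>>>>>>>     futures.add(pool.submit(new Callable<Result>() {
>>>>>>>>>>>>>>>>>>>>>>>       public Result call() throws Exception {
>>>>>>>>>>>>>>>>>>>>>>>         return reverseGet(connection, key); // one reverse scan per key
>>>>>>>>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>>>>>>>>     }));
>>>>>>>>>>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>>>>>>>>>>   List<Result> results = new ArrayList<Result>();
>>>>>>>>>>>>>>>>>>>>>>>   for (Future<Result> f : futures) {
>>>>>>>>>>>>>>>>>>>>>>>     results.add(f.get()); // gather
>>>>>>>>>>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>>>>>>>>>>   pool.shutdown();
>>>>>>>>>>>>>>>>>>>>>>>   return results;
>>>>>>>>>>>>>>>>>>>>>>> }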
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I use 2000 reader threads in the client and, on
>>>>>>>>>>>>>>>>>>>>>>> HBase, I've set hbase.regionserver.handler.count to
>>>>>>>>>>>>>>>>>>>>>>> 1500. With this setup, my scatter/gather is very
>>>>>>>>>>>>>>>>>>>>>>> slow and can take up to 10s in total. Timing an
>>>>>>>>>>>>>>>>>>>>>>> individual getScanner(..) call on the client side,
>>>>>>>>>>>>>>>>>>>>>>> it can easily take a few hundred ms. I also got the
>>>>>>>>>>>>>>>>>>>>>>> following metrics from the region server in
>>>>>>>>>>>>>>>>>>>>>>> question:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> "queueCallTime_mean" : 2.190855525775637,
>>>>>>>>>>>>>>>>>>>>>>> "queueCallTime_median" : 0.0,
>>>>>>>>>>>>>>>>>>>>>>> "queueCallTime_75th_percentile" : 0.0,
>>>>>>>>>>>>>>>>>>>>>>> "queueCallTime_95th_percentile" : 1.0,
>>>>>>>>>>>>>>>>>>>>>>> "queueCallTime_99th_percentile" : 556.9799999999818,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> "processCallTime_min" : 0,
>>>>>>>>>>>>>>>>>>>>>>> "processCallTime_max" : 12755,
>>>>>>>>>>>>>>>>>>>>>>> "processCallTime_mean" : 105.64873440912682,
>>>>>>>>>>>>>>>>>>>>>>> "processCallTime_median" : 0.0,
>>>>>>>>>>>>>>>>>>>>>>> "processCallTime_75th_percentile" : 2.0,
>>>>>>>>>>>>>>>>>>>>>>> "processCallTime_95th_percentile" : 7917.95,
>>>>>>>>>>>>>>>>>>>>>>> "processCallTime_99th_percentile" : 8876.89,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> "namespace_default_table_delta_region_87be70d7710f95c05cfcc90181d183b4_metric_scanNext_min" : 89,
>>>>>>>>>>>>>>>>>>>>>>> "namespace_default_table_delta_region_87be70d7710f95c05cfcc90181d183b4_metric_scanNext_max" : 11300,
>>>>>>>>>>>>>>>>>>>>>>> "namespace_default_table_delta_region_87be70d7710f95c05cfcc90181d183b4_metric_scanNext_mean" : 654.4949739797315,
>>>>>>>>>>>>>>>>>>>>>>> "namespace_default_table_delta_region_87be70d7710f95c05cfcc90181d183b4_metric_scanNext_median" : 101.0,
>>>>>>>>>>>>>>>>>>>>>>> "namespace_default_table_delta_region_87be70d7710f95c05cfcc90181d183b4_metric_scanNext_75th_percentile" : 101.0,
>>>>>>>>>>>>>>>>>>>>>>> "namespace_default_table_delta_region_87be70d7710f95c05cfcc90181d183b4_metric_scanNext_95th_percentile" : 101.0,
>>>>>>>>>>>>>>>>>>>>>>> "namespace_default_table_delta_region_87be70d7710f95c05cfcc90181d183b4_metric_scanNext_99th_percentile" : 113.0,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Where "delta" is the name of the
>> table I am
>>>>>>>>> querying.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> In addition to all this, I monitored the hardware
>>>>>>>>>>>>>>>>>>>>>>> resources (CPU, disk, and network) of both the
>>>>>>>>>>>>>>>>>>>>>>> client and the region server, and nothing seems
>>>>>>>>>>>>>>>>>>>>>>> anywhere near saturation. So I am puzzled by what's
>>>>>>>>>>>>>>>>>>>>>>> going on and where this time is going.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> A few things to note based on the above
>>>>>>>>>>>>>>>>>>>>>>> measurements: both medians of IPC processCallTime
>>>>>>>>>>>>>>>>>>>>>>> and queueCallTime are basically zero (ms, I
>>>>>>>>>>>>>>>>>>>>>>> presume, right?). However, scanNext_median is 101
>>>>>>>>>>>>>>>>>>>>>>> (ms too, right?). I am not sure how this adds up.
>>>>>>>>>>>>>>>>>>>>>>> Also, even though the 101 figure seems outrageously
>>>>>>>>>>>>>>>>>>>>>>> high and I don't know why, all these scans should
>>>>>>>>>>>>>>>>>>>>>>> still be happening in parallel, so the overall call
>>>>>>>>>>>>>>>>>>>>>>> should finish fast, given that no hardware resource
>>>>>>>>>>>>>>>>>>>>>>> is contended, right? But this is not what's
>>>>>>>>>>>>>>>>>>>>>>> happening, so I have to be missing something(s).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> So, any help is appreciated there.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>> Khaled
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Best regards,
>>>>>>>>>>
>>>>>>>>>> - Andy
>>>>>>>>>>
>>>>>>>>>> Problems worthy of attack prove their worth by hitting
>> back. -
>>>>> Piet
>>>>>>> Hein
>>>>>>>>>> (via Tom White)
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>
                                          
