[ 
https://issues.apache.org/jira/browse/HBASE-16604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15522451#comment-15522451
 ] 

ramkrishna.s.vasudevan edited comment on HBASE-16604 at 9/26/16 8:41 AM:
-------------------------------------------------------------------------

[~enis]
Due to a recent failure in the master branch test cases I had to check this fix. A few questions on the fix; please correct me if I am wrong.
{code}
// The scanner state might be left in a dirty state, so we will tell the Client to
// fail this RPC and close the scanner while opening up another one from the start of
// row that the client has last seen.
closeScanner(region, scanner, scannerName, context);

// We closed the scanner already. Instead of throwing the IOException, and client
// retrying with the same scannerId only to get USE on the next RPC, we directly throw
// a special exception to save an RPC.
if (VersionInfoUtil.hasMinimumVersion(context.getClientVersionInfo(), 1, 4)) {
  // 1.4.0+ clients know how to handle
  throw new ScannerResetException("Scanner is closed on the server-side", e);
} else {
  // older clients do not know about SRE. Just throw USE, which they will handle
  throw new UnknownScannerException("Throwing UnknownScannerException to reset the client"
      + " scanner state for clients older than 1.3.", e);
}
{code}
So in the catch block, closeScanner() happens first and then a ScannerResetException of type DNRE is thrown. 
Inside closeScanner() we first remove the scanner from the scanners map and then also cancel the lease (the lease is removed from the lease holder).
Now the client, on seeing this exception, will retry. Since the scanner has been removed from the scanners map and we already have a scannerId associated with the scan request,
{code}
if (request.hasScannerId()) {
  rsh = scanners.get(scannerName);
  if (rsh == null) {
    LOG.warn("Client tried to access missing scanner " + scannerName);
    throw new UnknownScannerException(
      "Unknown scanner '" + scannerName + "'. This can happen due to any of the following "
          + "reasons: a) Scanner id given is wrong, b) Scanner lease expired because of "
          + "long wait between consecutive client checkins, c) Server may be closing down, "
          + "d) RegionServer restart during upgrade.\nIf the issue is due to reason (b), a "
          + "possible fix would be increasing the value of"
          + "'hbase.client.scanner.timeout.period' configuration.");
  }
{code}
Will not the subsequent retry also fail? 
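To make this concern concrete, here is a minimal stand-alone sketch (plain Java; the map, the exception class and closeScanner() are stand-ins I made up, not the actual RSRpcServices code): once closeScanner() removes the id from the scanners map, a retry that reuses the same scannerId can only miss the lookup and hit the USE path again.

{code}
import java.util.HashMap;
import java.util.Map;

public class RetryAfterCloseSketch {
  // Stand-in for RSRpcServices.scanners (scannerName -> scanner holder).
  static final Map<String, Object> scanners = new HashMap<>();

  static class UnknownScannerException extends RuntimeException {
    UnknownScannerException(String msg) { super(msg); }
  }

  // Stand-in for closeScanner(): the real method removes the entry and cancels the lease.
  static void closeScanner(String scannerName) {
    scanners.remove(scannerName);
  }

  // Stand-in for the scan() lookup path quoted above.
  static Object lookup(String scannerName) {
    Object rsh = scanners.get(scannerName);
    if (rsh == null) {
      throw new UnknownScannerException("Unknown scanner '" + scannerName + "'");
    }
    return rsh;
  }

  public static void main(String[] args) {
    scanners.put("1", new Object());  // scanner opened, id = 1
    closeScanner("1");                // server closes it in the catch block
    try {
      lookup("1");                    // client retry with the same scannerId
      System.out.println("retry succeeded");
    } catch (UnknownScannerException e) {
      // The retry misses the map unless the client opens a fresh scanner.
      System.out.println("retry failed: " + e.getMessage());
    }
  }
}
{code}

This only models the map lookup; whether the real client reopens with a new id instead of reusing the old one is exactly the question above.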
The other problem here is that the above catch block is immediately followed by a finally block:
{code}
finally {
  if (context != null) {
    context.setCallBack(rsh.shippedCallback);
  }
  // Adding resets expiration time on lease.
  if (scanners.containsKey(scannerName)) {
    ttl = this.scannerLeaseTimeoutPeriod;
    // When context != null, adding back the lease will be done in callback set above.
    if (context == null) {
      if (lease != null) regionServer.leases.addLease(lease);
    }
  }
}
{code}
the rpcCallback (when context != null) is set to the shippedCallback. But even before this, since the patch closes the scanner, the scanner has already been removed from the scanner map. Hence RegionScannerShippedCallBack#run() will not be able to add the lease back, so the LeaseExpiry listener does not do the necessary cleanup, and the block eviction also fails. 
The 2nd part of the problem is what I was checking, and while doing so I found the first part may also have a problem, though I may be wrong.
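The same kind of isolated sketch (again plain Java with stand-in names and maps, not the actual HBase types) shows the finally-block interaction: because closeScanner() already removed the entry, the containsKey guard keeps the shipped callback from ever re-adding the lease, so nothing is left for the lease-expiry listener to clean up.

{code}
import java.util.HashMap;
import java.util.Map;

public class ShippedCallbackSketch {
  static final Map<String, Object> scanners = new HashMap<>();
  static final Map<String, Object> leases = new HashMap<>();  // stand-in for Leases
  static boolean cleanupRan = false;                          // stand-in for expiry cleanup

  // Stand-in for RegionScannerShippedCallBack#run(): re-adds the lease only if
  // the scanner is still registered, mirroring the containsKey guard quoted above.
  static void shippedCallback(String scannerName, Object lease) {
    if (scanners.containsKey(scannerName)) {
      leases.put(scannerName, lease);
    }
  }

  // Stand-in for the lease-expiry listener, which would close the scanner and
  // release its blocks; it can only fire for leases that were actually added.
  static void expireLeases() {
    cleanupRan = !leases.isEmpty();
  }

  public static void main(String[] args) {
    scanners.put("1", new Object());
    scanners.remove("1");                // closeScanner() already ran in the catch block
    shippedCallback("1", new Object());  // rpc callback set in the finally block fires later
    expireLeases();
    // The lease was never re-added, so the expiry-driven cleanup never runs.
    System.out.println("cleanupRan = " + cleanupRan);
  }
}
{code}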

Now we have 2 things, I think:
-> In case of actual retries, whether the scanner internals and its heap are reset properly.
-> In case my retries are over, how am I cleaning up the heap and also the blocks? This will happen only for the master branch, I think, and we need to fix it only in 2.0.
Thanks [~enis].



> Scanner retries on IOException can cause the scans to miss data 
> ----------------------------------------------------------------
>
>                 Key: HBASE-16604
>                 URL: https://issues.apache.org/jira/browse/HBASE-16604
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver, Scanners
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 1.2.4
>
>         Attachments: HBASE-16604-branch-1.3-addendum.patch, 
> hbase-16604_v1.patch, hbase-16604_v2.patch, hbase-16604_v3.branch-1.patch, 
> hbase-16604_v3.patch
>
>
> Debugging an ITBLL failure, where the Verify did not "see" all the data in 
> the cluster, I've noticed that if we end up getting a generic IOException 
> from the HFileReader level, we may end up missing the rest of the data in the 
> region. I was able to manually test this, and this stack trace helps to 
> understand what is going on: 
> {code}
> 2016-09-09 16:27:15,633 INFO  [hconnection-0x71ad3d8a-shared--pool21-t9] 
> client.ScannerCallable(376): Open scanner=1 for 
> scan={"loadColumnFamiliesOnDemand":null,"startRow":"","stopRow":"","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":2097152,"families":{"testFamily":["testFamily"]},"caching":100,"maxVersions":1,"timeRange":[0,9223372036854775807]}
>  on region 
> region=testScanThrowsException,,1473463632707.b2adfb618e5d0fe225c1dc40c0eabfee.,
>  hostname=hw10676,51833,1473463626529, seqNum=2
> 2016-09-09 16:27:15,634 INFO  
> [B.fifo.QRpcServer.handler=5,queue=0,port=51833] 
> regionserver.RSRpcServices(2196): scan request:scanner_id: 1 number_of_rows: 
> 100 close_scanner: false next_call_seq: 0 client_handles_partials: true 
> client_handles_heartbeats: true renew: false
> 2016-09-09 16:27:15,635 INFO  
> [B.fifo.QRpcServer.handler=5,queue=0,port=51833] 
> regionserver.RSRpcServices(2510): Rolling back next call seqId
> 2016-09-09 16:27:15,635 INFO  
> [B.fifo.QRpcServer.handler=5,queue=0,port=51833] 
> regionserver.RSRpcServices(2565): Throwing new 
> ServiceExceptionjava.io.IOException: Could not reseek 
> StoreFileScanner[HFileScanner for reader 
> reader=hdfs://localhost:51795/user/enis/test-data/d6fb1c70-93c1-4099-acb7-5723fc05a737/data/default/testScanThrowsException/b2adfb618e5d0fe225c1dc40c0eabfee/testFamily/5a213cc23b714e5e8e1a140ebbe72f2c,
>  compression=none, cacheConf=blockCache=LruBlockCache{blockCount=0, 
> currentSize=1567264, freeSize=1525578848, maxSize=1527146112, 
> heapSize=1567264, minSize=1450788736, minFactor=0.95, multiSize=725394368, 
> multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false, firstKey=aaa/testFamily:testFamily/1473463633859/Put, 
> lastKey=zzz/testFamily:testFamily/1473463634271/Put, avgKeyLen=35, 
> avgValueLen=3, entries=17576, length=866998, 
> cur=/testFamily:/OLDEST_TIMESTAMP/Minimum/vlen=0/seqid=0] to key 
> /testFamily:testFamily/LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0
> 2016-09-09 16:27:15,635 DEBUG 
> [B.fifo.QRpcServer.handler=5,queue=0,port=51833] ipc.CallRunner(110): 
> B.fifo.QRpcServer.handler=5,queue=0,port=51833: callId: 26 service: 
> ClientService methodName: Scan size: 26 connection: 192.168.42.75:51903
> java.io.IOException: Could not reseek StoreFileScanner[HFileScanner for 
> reader 
> reader=hdfs://localhost:51795/user/enis/test-data/d6fb1c70-93c1-4099-acb7-5723fc05a737/data/default/testScanThrowsException/b2adfb618e5d0fe225c1dc40c0eabfee/testFamily/5a213cc23b714e5e8e1a140ebbe72f2c,
>  compression=none, cacheConf=blockCache=LruBlockCache{blockCount=0, 
> currentSize=1567264, freeSize=1525578848, maxSize=1527146112, 
> heapSize=1567264, minSize=1450788736, minFactor=0.95, multiSize=725394368, 
> multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false, firstKey=aaa/testFamily:testFamily/1473463633859/Put, 
> lastKey=zzz/testFamily:testFamily/1473463634271/Put, avgKeyLen=35, 
> avgValueLen=3, entries=17576, length=866998, 
> cur=/testFamily:/OLDEST_TIMESTAMP/Minimum/vlen=0/seqid=0] to key 
> /testFamily:testFamily/LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0
>       at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:224)
>       at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55)
>       at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:312)
>       at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:268)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:815)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:803)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:625)
>       at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5736)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5887)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5674)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2434)
>       at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32295)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2127)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
>       at org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:108)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: foo bar baz
>       at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:207)
>       ... 17 more
> 2016-09-09 16:27:15,653 INFO  [hconnection-0x71ad3d8a-shared--pool21-t10] 
> client.ScannerCallable(261): Got exception making request scanner_id: 1 
> number_of_rows: 100 close_scanner: false next_call_seq: 0 
> client_handles_partials: true client_handles_heartbeats: true renew: false to 
> region=testScanThrowsException,,1473463632707.b2adfb618e5d0fe225c1dc40c0eabfee.,
>  hostname=hw10676,51833,1473463626529, seqNum=2
> java.io.IOException: java.io.IOException: Could not reseek 
> StoreFileScanner[HFileScanner for reader 
> reader=hdfs://localhost:51795/user/enis/test-data/d6fb1c70-93c1-4099-acb7-5723fc05a737/data/default/testScanThrowsException/b2adfb618e5d0fe225c1dc40c0eabfee/testFamily/5a213cc23b714e5e8e1a140ebbe72f2c,
>  compression=none, cacheConf=blockCache=LruBlockCache{blockCount=0, 
> currentSize=1567264, freeSize=1525578848, maxSize=1527146112, 
> heapSize=1567264, minSize=1450788736, minFactor=0.95, multiSize=725394368, 
> multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false, firstKey=aaa/testFamily:testFamily/1473463633859/Put, 
> lastKey=zzz/testFamily:testFamily/1473463634271/Put, avgKeyLen=35, 
> avgValueLen=3, entries=17576, length=866998, 
> cur=/testFamily:/OLDEST_TIMESTAMP/Minimum/vlen=0/seqid=0] to key 
> /testFamily:testFamily/LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0
>       at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:224)
>       at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55)
>       at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:312)
>       at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:268)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:815)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:803)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:625)
>       at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5736)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5887)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5674)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2434)
>       at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32295)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2127)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
>       at org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:108)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: foo bar baz
>       at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:207)
>       ... 17 more
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>       at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>       at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
>       at 
> org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:332)
>       at 
> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:256)
>       at 
> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:1)
>       at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>       at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:364)
>       at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:1)
>       at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
>       at 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): 
> java.io.IOException: Could not reseek StoreFileScanner[HFileScanner for 
> reader 
> reader=hdfs://localhost:51795/user/enis/test-data/d6fb1c70-93c1-4099-acb7-5723fc05a737/data/default/testScanThrowsException/b2adfb618e5d0fe225c1dc40c0eabfee/testFamily/5a213cc23b714e5e8e1a140ebbe72f2c,
>  compression=none, cacheConf=blockCache=LruBlockCache{blockCount=0, 
> currentSize=1567264, freeSize=1525578848, maxSize=1527146112, 
> heapSize=1567264, minSize=1450788736, minFactor=0.95, multiSize=725394368, 
> multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false, firstKey=aaa/testFamily:testFamily/1473463633859/Put, 
> lastKey=zzz/testFamily:testFamily/1473463634271/Put, avgKeyLen=35, 
> avgValueLen=3, entries=17576, length=866998, 
> cur=/testFamily:/OLDEST_TIMESTAMP/Minimum/vlen=0/seqid=0] to key 
> /testFamily:testFamily/LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0
>       at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:224)
>       at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55)
>       at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:312)
>       at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:268)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:815)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:803)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:625)
>       at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5736)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5887)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5674)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2434)
>       at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32295)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2127)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
>       at org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:108)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: foo bar baz
>       at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:207)
>       ... 17 more
>       at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1225)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
>       at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32741)
>       at 
> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:214)
>       ... 9 more
> 2016-09-09 16:27:15,963 INFO  
> [B.priority.fifo.QRpcServer.handler=8,queue=0,port=51824] 
> regionserver.RSRpcServices(2196): scan request:region { type: REGION_NAME 
> value: "hbase:meta,,1" } scan { start_row: 
> "testScanThrowsException,,99999999999999" max_versions: 1 cache_blocks: true 
> small: true reversed: true caching: 1 } number_of_rows: 1 close_scanner: true 
> client_handles_partials: true client_handles_heartbeats: true
> 2016-09-09 16:27:15,965 INFO  
> [B.priority.fifo.QRpcServer.handler=8,queue=0,port=51824] 
> regionserver.RSRpcServices(2549): class 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanResponse:cells_per_result:
>  4 scanner_id: 15 more_results: false stale: false partial_flag_per_result: 
> false more_results_in_region: true heartbeat_message: false
> 2016-09-09 16:27:15,967 INFO  
> [B.fifo.QRpcServer.handler=0,queue=0,port=51833] 
> regionserver.RSRpcServices(2196): scan request:scanner_id: 1 number_of_rows: 
> 100 close_scanner: false next_call_seq: 0 client_handles_partials: true 
> client_handles_heartbeats: true renew: false
> 2016-09-09 16:27:15,967 INFO  
> [B.fifo.QRpcServer.handler=0,queue=0,port=51833] 
> regionserver.RSRpcServices(2549): class 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanResponse:scanner_id:
>  1 more_results: true ttl: 60000 stale: false more_results_in_region: false
> 2016-09-09 16:27:15,968 INFO  
> [B.fifo.QRpcServer.handler=1,queue=0,port=51833] 
> regionserver.RSRpcServices(2196): scan request:scanner_id: 1 number_of_rows: 
> 0 close_scanner: true client_handles_partials: true 
> client_handles_heartbeats: true
> 2016-09-09 16:27:15,968 INFO  
> [B.fifo.QRpcServer.handler=1,queue=0,port=51833] 
> regionserver.RSRpcServices(2549): class 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanResponse:scanner_id:
>  1 more_results: false
> 2016-09-09 16:27:15,971 INFO  [main] hbase.HBaseTestingUtility(1111): 
> Shutting down minicluster
> {code}
> The ScannerCallable is retryable if it gets a IOException. Before the 
> HBASE-13662 fix, the retry was still failing due to 
> OutOfOrderScannerException, which then causes the Scanner stack to be reset, 
> thus masking the issue. After HBASE-13662, we can retry the same scan, but in 
> case of we are doing a {{reseek()}} which fails in middle with an 
> IOException, the KVHeaps and the Scanner state on the server side is left 
> dirty. 
> It seems to be much safer to reset the scanner state in case of IOExceptions. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
