[
https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362989#comment-16362989
]
Umesh Agashe commented on HBASE-19988:
--------------------------------------
It was logging the following exception... several times!
{code:java}
2018-02-10 04:24:25,503 WARN [PutThread] regionserver.HRegion(5636): Thread interrupted waiting for lock on row: row0
2018-02-10 04:24:25,503 WARN [PutThread] regionserver.HRegion$BatchOperation(3173): Failed getting lock, row=row0
java.io.InterruptedIOException
  at org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5637)
  at org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.lockRowsAndBuildMiniBatch(HRegion.java:3168)
  at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:3837)
  at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3810)
  at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3741)
  at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3732)
  at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3746)
  at org.apache.hadoop.hbase.regionserver.HRegion.doBatchMutate(HRegion.java:4074)
  at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:2925)
  at org.apache.hadoop.hbase.regionserver.TestHRegion$PutThread.run(TestHRegion.java:3891)
Caused by: java.lang.InterruptedException
  at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
  at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:871)
  at org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5621)
  ... 9 more{code}
There is a loop in the write batch path:
{code:java}
while (!batchOp.isDone()) {
  doMiniBatchMutate(batchOp);
}{code}
Essentially, this loop tries to acquire locks on as many rows in a batch as
possible and creates a mini-batch of those rows to write. The next iteration
resumes acquiring locks from the row at which the previous iteration failed to
acquire a lock, and so on until the entire batch is written.
The operation was aborted only on a timeout exception. All other exceptions
were logged and ignored, and the loop resumed creating and writing
mini-batches for the input batch.
In this particular case, getRowLockInternal() was failing with an
InterruptedIOException caused by surefire (possibly due to a test timeout). The
exception was ignored so that the write could proceed with the rows locked so
far. As a result, doMiniBatchMutate() was called continuously in the loop,
filling up the logs.
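To make the failure mode concrete, here is a minimal, self-contained sketch. It is not HBase code: the Batch class, lockRow(), doMiniBatch(), mutate() and the maxIterations guard are hypothetical stand-ins, and the sketch assumes the thread's interrupt flag stays set between lock attempts. Aborting on InterruptedIOException is shown as one possible remedy and is not taken from the attached patch.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

public class MiniBatchLoopSketch {

  /** Hypothetical stand-in for a batch of row mutations. */
  static final class Batch {
    final int size;
    int nextIndex = 0;       // first row that has not been written yet
    int failedAttempts = 0;  // stands in for the WARN lines written to the log

    Batch(int size) { this.size = size; }

    boolean isDone() { return nextIndex >= size; }
  }

  /** Simulated row lock: fails whenever the current thread is interrupted. */
  static void lockRow() throws InterruptedIOException {
    // isInterrupted() does not clear the flag, so every later attempt
    // fails in the same way (an assumption of this sketch).
    if (Thread.currentThread().isInterrupted()) {
      throw new InterruptedIOException("interrupted waiting for row lock");
    }
  }

  /** One mini-batch: lock the next row and "write" it. */
  static void doMiniBatch(Batch batch, boolean abortOnInterrupt) throws IOException {
    try {
      lockRow();
      batch.nextIndex++;
    } catch (InterruptedIOException e) {
      batch.failedAttempts++;
      if (abortOnInterrupt) {
        throw e; // stop the whole batch instead of retrying
      }
      // Otherwise: logged and ignored, as described in the comment above.
    }
  }

  /** The batch loop; maxIterations only exists to keep the demo finite. */
  static int mutate(Batch batch, boolean abortOnInterrupt, int maxIterations)
      throws IOException {
    int iterations = 0;
    while (!batch.isDone() && iterations < maxIterations) {
      iterations++;
      doMiniBatch(batch, abortOnInterrupt);
    }
    return iterations;
  }

  public static void main(String[] args) throws IOException {
    Thread.currentThread().interrupt();
    try {
      Batch ignoring = new Batch(3);
      int iterations = mutate(ignoring, false, 1000);
      System.out.println("ignoring interrupt: iterations=" + iterations
          + ", rows written=" + ignoring.nextIndex);
      mutate(new Batch(3), true, 1000);
    } catch (InterruptedIOException e) {
      System.out.println("aborting on interrupt: stopped at first failure");
    } finally {
      Thread.interrupted(); // clear the flag set for the demo
    }
  }
}
```

With the interrupt flag set, the ignoring variant uses up its whole iteration budget without writing a single row, and every iteration corresponds to one more logged failure. The aborting variant stops at the first failed lock attempt.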
> HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while
> waiting for a row lock
> -----------------------------------------------------------------------------------------------
>
> Key: HBASE-19988
> URL: https://issues.apache.org/jira/browse/HBASE-19988
> Project: HBase
> Issue Type: Improvement
> Components: amv2
> Affects Versions: 2.0.0-beta-1
> Reporter: Umesh Agashe
> Assignee: Umesh Agashe
> Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: hbase-19988.master.001.patch
>
>
> See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file.