Is the behavior of StoreScanner during block skipping as expected?

Lorin Lee Mon, 16 Sep 2024 11:26:09 -0700

Hi,

I am new to HBase and have a question about how StoreScanner behaves
when scanning data. When `ScanQueryMatcher.compareKeyForNextColumn`
gets a null ColumnHint, it executes exactly the same logic as
`compareKeyForNextRow`. What is the benefit of doing this?

See:
https://github.com/apache/hbase/blob/a3af60980c61fb4be31e0dcd89880f304d01098a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/querymatcher/ScanQueryMatcher.java#L321

Consider this scenario: Block1 contains | r1/c1/t2, r1/c1/t1,
r1/c2/t2, r1/c2/t1, r1/c3/t1 | and Block2 contains | r1/c4/t1,
r1/c5/t1 |. The read request scans the latest version of every
column in every row, so ScanWildcardColumnTracker is used. When the
scanner reads r1/c2/t1, the number of versions to read has already
been reached (r1/c2/t2 was read earlier), so
ScanWildcardColumnTracker.checkVersions returns SEEK_NEXT_COL. That
triggers StoreScanner.trySkipToNextColumn, which in turn calls
ScanQueryMatcher.compareKeyForNextColumn. Here the ColumnHint is
null (ScanWildcardColumnTracker does not provide one), and with the
current implementation this results in a seek. The better choice
would be to keep scanning, because the next column starts at
r1/c3/t1, the very next cell in the same block. But because
StoreScanner.trySkipToNextColumn, when there is no ColumnHint,
compares only the row key and ignores the column, the result is an
unnecessary seek.

So my question is: is this behavior expected? In
ScanQueryMatcher.compareKeyForNextColumn, when the ColumnHint is
null, why not also compare the columns of nextIndexed and
currentCell?
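
To make the question concrete, here is a toy model of the two
comparisons applied to the scenario above. It is only a sketch and
not HBase code: the `Key` class and the methods `skipByRowOnly` and
`skipByRowAndColumn` are hypothetical names invented for
illustration, and plain string comparison stands in for HBase's cell
comparators. It assumes nextIndexed is the first key of Block2
(r1/c4/t1).

```java
public class SkipOrSeekSketch {
    // Hypothetical stand-in for a cell key: row, column qualifier, timestamp.
    static final class Key {
        final String row;
        final String col;
        final long ts;

        Key(String row, String col, long ts) {
            this.row = row;
            this.col = col;
            this.ts = ts;
        }
    }

    // Models the null-ColumnHint path as described in this mail:
    // only the row is compared, so skipping inside the block is
    // chosen only when the next block starts on a later row.
    static boolean skipByRowOnly(Key nextIndexed, Key current) {
        return nextIndexed.row.compareTo(current.row) > 0;
    }

    // Models the proposed alternative: if the next block starts on the
    // same row but at a later column, the rest of the current column
    // lies in the current block, so scanning forward is enough.
    static boolean skipByRowAndColumn(Key nextIndexed, Key current) {
        int rowCmp = nextIndexed.row.compareTo(current.row);
        if (rowCmp != 0) {
            return rowCmp > 0;
        }
        return nextIndexed.col.compareTo(current.col) > 0;
    }

    public static void main(String[] args) {
        Key current = new Key("r1", "c2", 1L);      // r1/c2/t1 in Block1
        Key nextIndexed = new Key("r1", "c4", 1L);  // first key of Block2

        System.out.println("row only:       "
            + (skipByRowOnly(nextIndexed, current) ? "skip" : "seek"));
        System.out.println("row and column: "
            + (skipByRowAndColumn(nextIndexed, current) ? "skip" : "seek"));
    }
}
```

In this model the row-only check chooses a seek, while the
column-aware check chooses to keep scanning and reaches r1/c3/t1 on
the next cell.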

Thanks!

Best regards,
Lorin Lee
