mwkang commented on code in PR #6900:
URL: https://github.com/apache/hbase/pull/6900#discussion_r2039655382
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java:
##########
@@ -832,7 +845,7 @@ private void updateMetricsStore(boolean memstoreRead) {
* @return null is the top cell doesn't change. Otherwise, the NextState to
return
*/
private NextState needToReturn(List<? super ExtendedCell> outResult) {
- if (!outResult.isEmpty() && topChanged) {
+ if (topChanged) {
Review Comment:
Let’s assume the data stored in HBase is as follows:
(1) row0/family2:qf1/DeleteColumn
(2) row0/family2:qf1/Put/value2
(3) row1/family1:qf1/Put/value2
(4) row1/family2:qf1/Put/value2
Now, suppose a user starts scanning from **row0**.
In
[`RegionScannerImpl#nextInternal`](https://github.com/apache/hbase/blob/0b3c17302843d1f4d6f3c6b458f837cb9c274510/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScannerImpl.java#L415),
when the [current
cell](https://github.com/apache/hbase/blob/0b3c17302843d1f4d6f3c6b458f837cb9c274510/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScannerImpl.java#L446)’s
row is `row0`, after reading entry (2) in `StoreScanner`, if a flush happens,
a `topChanged` occurs ([`Storescanner.peek() is changed where before
...`](https://github.com/apache/hbase/blob/0b3c17302843d1f4d6f3c6b458f837cb9c274510/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java#L1074)),
and the value of `StoreScanner`’s `heap.peek()` becomes (4)
`row1/family2:qf1/Put/value2`.
Since it is the next row, `StoreScanner` *should* return at that point — but
it fails to recognize that it has moved to the next row, and ends up including
the new row in the result.
Then, in `RegionScannerImpl`, it sees that `nextKv`’s row is different from
the current cell’s row, and returns (since it has moved to a different row).
As a result, even though (3) and (4) belong to the same row (`row1`), they
are returned to the client as if they were from different rows.
(3) and (4) should be combined into a single
[Result](https://github.com/apache/hbase/blob/0b3c17302843d1f4d6f3c6b458f837cb9c274510/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Result.java),
but they end up being returned as separate Result instances.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]