[
https://issues.apache.org/jira/browse/NIFI-3639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939835#comment-15939835
]
ASF GitHub Bot commented on NIFI-3639:
--------------------------------------
Github user baolsen commented on the issue:
https://github.com/apache/nifi/pull/1615
Hi @bbende, my pleasure!
Thanks for the quick response.
The intention was to use the Get in another processor I am writing, which
only needs single-row lookups (I didn't realise at first that FetchHBaseRow
also does single-row lookups).
I had assumed that a Get would be more efficient than a Scan for fetching
single rows.
However, upon further reading it seems that the HBase client API uses a
Scan implementation for Gets as well.
https://www.cloudera.com/documentation/enterprise/5-4-x/topics/admin_hbase_scanning.html
There are some Stack Overflow questions about Get performing worse than
Scan, especially when the scan uses a key prefix as opposed to a full
rowkey.
https://www.quora.com/What-is-the-difference-between-get-and-scan-in-HBase
It's a little unclear which scenarios cause this performance difference, or
whether one approach is more performant in general, e.g. when the rowkey is a
full rowkey as in our case.
In summary, seeing as the HBase client API uses a Scanner under the hood
when doing a Get, there should be no real benefit to having a Get added to the
code (at least not without doing some practical benchmarks).
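To make the reasoning above concrete, here is a toy sketch of a single-row lookup written as a scan whose start and stop keys are the same rowkey, next to a prefix scan. It does not use the HBase client at all: a sorted in-memory TreeMap stands in for a table, and the class and method names (SingleRowLookupSketch, scan, scanPrefix, get) are made up for illustration. It only shows that the two lookups return the same row in this model; it says nothing about real HBase performance, which would still need benchmarks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: a sorted in-memory map stands in for an HBase table.
// None of these names come from the HBase or NiFi APIs.
public class SingleRowLookupSketch {

    // Rows are kept sorted by key, similar to how HBase orders rows by rowkey.
    private final TreeMap<String, String> rows = new TreeMap<>();

    public void put(String rowKey, String value) {
        rows.put(rowKey, value);
    }

    // Range scan over [startRow, stopRow]; both ends are inclusive in this model.
    public List<String> scan(String startRow, String stopRow) {
        return new ArrayList<>(rows.subMap(startRow, true, stopRow, true).values());
    }

    // Prefix scan: start at the prefix and stop at the first key that no longer matches.
    public List<String> scanPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : rows.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break;
            }
            out.add(e.getValue());
        }
        return out;
    }

    // Single-row "get" expressed as a scan whose start and stop are the same full rowkey.
    public String get(String rowKey) {
        List<String> hits = scan(rowKey, rowKey);
        return hits.isEmpty() ? null : hits.get(0);
    }

    public static void main(String[] args) {
        SingleRowLookupSketch table = new SingleRowLookupSketch();
        table.put("user#1", "alice");
        table.put("user#2", "bob");
        table.put("user#20", "carol");

        // A full rowkey matches at most one row; a prefix may match several.
        System.out.println(table.get("user#2"));
        System.out.println(table.scanPrefix("user#2"));
    }
}
```

With a full rowkey the scan touches at most one row, which is the case that matters for the processor discussed here; the prefix variant may touch several.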
I'll use the Scan for my processor instead since it already has the
functionality I need. Will add the processor as a separate PR.
Can I close this PR, or does that need to be done on your side?
Also, I see that the automated checks have failed (looks like other
components' tests). Is this something I should worry about for my next PR? :)
> Add HBase Get to HBase_1_1_2_ClientService
> ------------------------------------------
>
> Key: NIFI-3639
> URL: https://issues.apache.org/jira/browse/NIFI-3639
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Bjorn Olsen
> Priority: Trivial
>
> Enhance HBase_1_1_2_ClientService and API to provide HBase Get functionality.
> Currently only Put and Scan are supported.