Hi all,
We have a table in Hive/HBase with a composite row key, the first field
of which is a "bucket". Since the bucket is based on a hash, every query
on our data has to search each bucket and apply start and stop row
filters within it. The most efficient way to do this in HBase would be
to specify multiple scans (one per bucket) and set the appropriate start
and stop row key on each (using, for instance,
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormat.html).
Has anybody used this sort of functionality, or do people think it would
be useful? If nothing like this exists yet and people would find it
helpful, I'd be more than happy to raise a JIRA and write a patch for
hbase-handler (I already have one in progress).
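
To make the idea concrete, here is a minimal sketch of how the per-bucket
ranges could be built. It assumes a one-byte bucket prefix at the front of
the row key, which may not match our real key layout. The class and method
names (BucketScanRanges, rangesFor, withBucket) are made up for illustration
and are not part of HBase or hbase-handler. Each resulting range would
become the start and stop row of one scan.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustration only. Assumed key layout: [1-byte bucket][rest of row key].
final class BucketScanRanges {

    /** One [startRow, stopRow) pair, i.e. the bounds for a single scan. */
    static final class Range {
        final byte[] startRow;
        final byte[] stopRow;

        Range(byte[] startRow, byte[] stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }
    }

    /** Prepends the bucket byte to a key fragment. */
    static byte[] withBucket(int bucket, byte[] key) {
        byte[] out = new byte[key.length + 1];
        out[0] = (byte) bucket;
        System.arraycopy(key, 0, out, 1, key.length);
        return out;
    }

    /** Expands one logical key range into one range per bucket. */
    static List<Range> rangesFor(int numBuckets, byte[] start, byte[] stop) {
        if (numBuckets < 1 || numBuckets > 256) {
            throw new IllegalArgumentException("numBuckets must be in 1..256");
        }
        List<Range> ranges = new ArrayList<>(numBuckets);
        for (int b = 0; b < numBuckets; b++) {
            ranges.add(new Range(withBucket(b, start), withBucket(b, stop)));
        }
        return ranges;
    }

    /** Hex-encodes a key so it can be printed. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical query bounds on the non-bucket part of the key.
        byte[] start = "2015-01".getBytes(StandardCharsets.UTF_8);
        byte[] stop = "2015-02".getBytes(StandardCharsets.UTF_8);
        for (Range r : rangesFor(4, start, stop)) {
            System.out.println(hex(r.startRow) + " .. " + hex(r.stopRow));
        }
    }
}
```

The point of the sketch is that the predicate on the non-bucket part of the
key is computed once and then repeated under every bucket prefix, so each
scan only touches the slice of its bucket that can contain matching rows.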
Thanks!
Andrew