Hi all,

We have a Hive table backed by HBase with a composite row key, the first field of which is a "bucket". Because the bucket is derived from a hash, every query on our data has to visit each bucket and apply a start and stop row within it. The most efficient way to do this in HBase would be to issue multiple scans (one per bucket), each with its own start and stop row key (using, for instance, https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormat.html). Has anybody used this sort of functionality, or do people think it would be useful? If nothing like this exists yet and people would find it helpful, I'd be more than happy to raise a JIRA and write a patch for hbase-handler (I already have one in progress).
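
To make the idea concrete, here is a rough sketch of how the per-bucket ranges could be derived. It is not the patch itself. It assumes the bucket is a single leading byte of the row key and that there are at most 256 buckets; the class and method names (BucketRanges, rangesForAllBuckets) are made up for illustration, and it uses only the Java standard library.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BucketRanges {

    /** Start (inclusive) and stop (exclusive) row key for one bucket. */
    public static final class Range {
        public final byte[] startRow;
        public final byte[] stopRow;

        public Range(byte[] startRow, byte[] stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }
    }

    /** Prepends the bucket byte to the remainder of the composite key. */
    static byte[] prefixed(byte bucket, byte[] rest) {
        byte[] key = new byte[rest.length + 1];
        key[0] = bucket;
        System.arraycopy(rest, 0, key, 1, rest.length);
        return key;
    }

    /**
     * Builds one range per bucket for the same logical start/stop key.
     * Assumption: the bucket is encoded as one leading byte, 0..numBuckets-1.
     */
    public static List<Range> rangesForAllBuckets(int numBuckets, byte[] start, byte[] stop) {
        if (numBuckets < 1 || numBuckets > 256) {
            throw new IllegalArgumentException("numBuckets must be in 1..256");
        }
        List<Range> ranges = new ArrayList<>(numBuckets);
        for (int b = 0; b < numBuckets; b++) {
            ranges.add(new Range(prefixed((byte) b, start), prefixed((byte) b, stop)));
        }
        return ranges;
    }

    public static void main(String[] args) {
        byte[] start = "2015-01-01".getBytes(StandardCharsets.UTF_8);
        byte[] stop = "2015-02-01".getBytes(StandardCharsets.UTF_8);
        // Print the raw key bytes of each per-bucket range.
        for (Range r : rangesForAllBuckets(4, start, stop)) {
            System.out.println(Arrays.toString(r.startRow) + " .. " + Arrays.toString(r.stopRow));
        }
    }
}
```

Each Range would then become one HBase Scan with that start and stop row, and the resulting list of scans would be handed to a multi-scan input format such as the one linked above. How that wiring would look inside hbase-handler is exactly the part that is still open.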

Thanks!

Andrew
