We have an HBase table.
Each time we aggregate the table based on some columns, we do a full
scan of the entire table.
What are some ideas for extracting just the delta (the increments) since
the last load?
Right now I am following this approach, but I would like some better ideas.
- Mount the hbase into
My rowkey contains a reverse timestamp (max value - current timestamp).
Example: 9223370646332874562
select FROM_UNIXTIME(unix_timestamp('9223370646332874562', 'MMddHHmmssSSS'))
from HiveTest limit 1;
9226-01-07 22:34:42 -- obviously this won't give me the right result, as it's a
reverse tim
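
For illustration, here is a minimal Python sketch of how such a key could be decoded. It rests on two assumptions that are not stated above: that "max value" is Java's Long.MAX_VALUE (2^63 - 1), and that the original timestamp is epoch milliseconds. The function names are made up for this example. It also shows how a "since last load" window might translate into a range of reverse timestamps, which only helps if the reverse timestamp is the leading part of the rowkey.

```python
from datetime import datetime, timezone

# Assumption: "max value" in the post is Java's Long.MAX_VALUE.
LONG_MAX = 2**63 - 1


def reverse_to_epoch_millis(reverse_ts: int) -> int:
    """Undo the reversal: original = max value - reverse timestamp."""
    return LONG_MAX - reverse_ts


def reverse_to_utc(reverse_ts: int) -> datetime:
    """Decode a reverse timestamp into a timezone-aware UTC datetime."""
    seconds, millis = divmod(reverse_to_epoch_millis(reverse_ts), 1000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=millis * 1000)


def delta_bounds(last_load_millis: int, now_millis: int) -> tuple[int, int]:
    """Reverse-timestamp range [start, stop) for rows written in (last_load, now].

    Newer rows have SMALLER reverse timestamps, so the bounds swap order.
    """
    return LONG_MAX - now_millis, LONG_MAX - last_load_millis


example = 9223370646332874562
print(reverse_to_epoch_millis(example))    # 1390521901245
print(reverse_to_utc(example).isoformat())

# Hypothetical load window: everything written in the hour before `example`.
start, stop = delta_bounds(1390521901245 - 3_600_000, 1390521901245)
print(start, stop)
```

Under those assumptions the example key decodes to a time in January 2014 rather than the year 9226. If the keys are stored as strings, the range comparison only matches numeric order when every reverse timestamp has the same number of digits (19 in the example).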
I am querying a Hive table (mapped to an HBase table).
What are the techniques to tune the Hive query and to avoid HBase scans?
The query uses multiple SPLIT and SUBSTR functions and a WHERE condition,
something like:
select col1, col2, ...,count(*)
from hiveTable
where split( col1)[0] > timestamp1 an