No.
First, I apologize for my first response. I guess it's never a good idea to
check email at 4:00 in the morning before your first cup of coffee. ;-)
I went into a bit more detail that may have confused the issue.
To answer your question…
In other words Is querying over plain hive (ORC or Text
Think about it like this: one system is scanning a local ORC file, using an
HBase scanner (over the network), and scanning the data in sstable format?
On Fri, Jun 9, 2017 at 5:50 AM, Amey Barve wrote:
> Hi Michael,
>
> "If there is predicate pushdown, then you will be faster, assuming that
> the
Hi Michael,
"If there is predicate pushdown, then you will be faster, assuming that the
query triggers an implied range scan"
---> Does this bring results faster than plain hive querying over ORC /
Text file formats
In other words Is querying over plain hive (ORC or Text) *always* faster
than thr
The pro is that you have the ability to update a table without having to
worry about duplication of the row. Tez is doing some form of compaction for
you that already exists in HBase.
The cons:
1) It's slower. Reads from HBase have more overhead than just reading
a file. Read Lar
Why are you thinking of using HBase?
Just store the SCD versions in a normal Hive dimension table. In case
you are worried about updates to columns such as 'valid to' and 'latest
record indicator' you can calculate these on the fly using window
functions. No need to create and update them phys
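
A minimal sketch of the window-function idea, assuming a hypothetical Hive dimension table customer_dim(customer_id, address, valid_from) that stores one row per SCD version. The table and column names are made up for illustration and do not come from this thread. Here 'valid to' is taken as the start date of the next version, and the latest record is the one with no successor:

```sql
-- Hypothetical table: customer_dim(customer_id, address, valid_from).
-- Only valid_from is stored; the other two SCD columns are derived at query time.
SELECT
  customer_id,
  address,
  valid_from,
  -- The next version's start date closes the current version.
  -- It is NULL for the newest version of each customer_id.
  LEAD(valid_from) OVER (PARTITION BY customer_id ORDER BY valid_from) AS valid_to,
  -- A row with no successor is the latest record for its key.
  CASE
    WHEN LEAD(valid_from) OVER (PARTITION BY customer_id ORDER BY valid_from) IS NULL
      THEN 'Y'
    ELSE 'N'
  END AS latest_record_indicator
FROM customer_dim;
```

Wrapping a query like this in a view would let readers see both columns without anything being physically updated, at the cost of evaluating the window on every read.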
As far as I know, using Hive on HBase can only be done through Hive
Example
hive> create external table MARKETDATAHBASE (key STRING, TICKER STRING,
TIMECREATED STRING, PRICE STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH
SERDEPROPERTIES ("hbase.columns.mapping" =
":key,PRI