Is it better to map an existing HBase table to Hive, or to create Hive
tables directly?
I am reiterating the two scenarios.
I have a requirement to process huge weblogs on a daily basis.
Scenario 1. Data will arrive incrementally in the datastore (containing
timestamp, userid, operation performed) o
HDFS lacks random read and write access. This is where HBase comes into
the picture. It stores data as key/value pairs.
Hive provides data warehousing facilities on top of an existing Hadoop
cluster. It provides an SQL-like interface, which makes your work easier.
You can create tables in Hive and store
I have a requirement to process huge weblogs on a daily basis.
1. Data will arrive incrementally in the datastore on a daily basis, and I need
cumulative and daily
distinct user counts from the logs; after that, the aggregated data will be loaded
into an RDBMS such as MySQL.
2. Data will be loaded into the HDFS data warehouse o
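
To make the requirement concrete, the daily and cumulative distinct user counts mentioned above can be sketched in plain Python, independent of whether the data ends up in HBase or in Hive. This is only an illustration under assumptions: the function name, the (timestamp, userid, operation) record layout with ISO-formatted timestamps, and the sample rows are all hypothetical, and a real job over huge logs would more likely run as a Hive query or MapReduce job than in a single process.

```python
from collections import defaultdict
from datetime import datetime


def distinct_user_counts(records):
    """Return (day, daily_distinct_users, cumulative_distinct_users) per day.

    `records` is an iterable of (timestamp, userid, operation) tuples, where
    timestamp is an ISO-8601 string. This layout is an assumption based on
    the fields named in the question.
    """
    # Group the user IDs seen on each calendar day.
    users_by_day = defaultdict(set)
    for timestamp, userid, _operation in records:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        users_by_day[day].add(userid)

    # Walk the days in order, keeping a running set of every user seen so far.
    seen_so_far = set()
    rows = []
    for day in sorted(users_by_day):
        seen_so_far |= users_by_day[day]
        rows.append((day, len(users_by_day[day]), len(seen_so_far)))
    return rows


# Hypothetical sample log rows, for illustration only.
sample_logs = [
    ("2013-01-01T09:00:00", "u1", "login"),
    ("2013-01-01T09:05:00", "u2", "view"),
    ("2013-01-01T10:00:00", "u1", "logout"),
    ("2013-01-02T08:30:00", "u2", "login"),
    ("2013-01-02T08:45:00", "u3", "view"),
]

for row in distinct_user_counts(sample_logs):
    print(row)
```

The rows returned here are the kind of small aggregated result that could then be loaded into the RDBMS; the incremental part would amount to persisting the running set of users (or an equivalent table) between daily runs.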