Re: data transfer from rdbms to hive

2014-05-01 Thread Shushant Arora
But how do I achieve dynamic partitioning? For each row in MySQL, I need to derive the partition name from the date column and insert the row into the corresponding partition in Hive. Sqoop requires the partition to be specified beforehand. On Fri, May 2, 2014 at 8:36 AM, unmesha sreeveni wrote: > I suggest you to go for sqoop - They import
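A rough sketch of the per-row step described above: take the timestamp of each record, derive a year/month partition name from it, and group the rows by that name. The column name `activity_ts`, the timestamp format and the `year=.../month=...` layout are assumptions for illustration, not something stated in the thread. Plain Python is used only to show the logic; it is not a Sqoop or Hive API.

```python
from collections import defaultdict
from datetime import datetime


def partition_for(ts: str) -> str:
    """Return a Hive-style partition path for a 'YYYY-MM-DD HH:MM:SS' string."""
    dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    return f"year={dt.year}/month={dt.month:02d}"


def group_by_partition(rows, ts_column="activity_ts"):
    """Group rows (dicts) by the partition derived from their timestamp column.

    ts_column is a hypothetical column name; use whatever the MySQL table has.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[partition_for(row[ts_column])].append(row)
    return dict(groups)


# Made-up sample records standing in for rows read from MySQL.
sample_rows = [
    {"user_id": 1, "activity_ts": "2014-04-30 23:59:59"},
    {"user_id": 2, "activity_ts": "2014-05-01 19:13:00"},
    {"user_id": 3, "activity_ts": "2014-05-02 08:36:00"},
]

grouped = group_by_partition(sample_rows)
for name in sorted(grouped):
    print(name, len(grouped[name]))
```

Inside Hive the same idea is usually expressed with a dynamic-partition insert: load the data into an unpartitioned staging table first, then insert into the partitioned table while selecting `year(ts)` and `month(ts)` as the partition columns, with `hive.exec.dynamic.partition` enabled.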

[Blog] How to create tables in Hive

2014-05-01 Thread unmesha sreeveni
Hi, http://www.unmeshasreeveni.blogspot.in/2014/04/how-to-create-tables-in-hive.html This is a blog post for beginners on creating tables in Hive. Please post your comments on it. Let me know your thoughts. -- *Thanks & Regards* *Unmesha Sreeveni U.B* *Hadoop, Bigdata Developer* *Center fo

Re: data transfer from rdbms to hive

2014-05-01 Thread unmesha sreeveni
I suggest you go for Sqoop; it imports data from an RDBMS. On Thu, May 1, 2014 at 7:13 PM, Shushant Arora wrote: > Hi > > I have a requirement to transfer data from RDBMS mysql to partitioned hive > table > Partitioned on Year and month. > Each record in mysql data contains timestamp of user

Re: hbase importtsv

2014-05-01 Thread Amit Tewari
Make sure there is no primary key clash. HBase overwrites the row if you upload data with the same primary key. That's one reason you could end up with fewer rows than you uploaded. Sent from my mobile device, please excuse the typos > On May 1, 2014, at 3:34 PM, "Kennedy, Sean C." wro
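To check whether a key clash explains the missing rows, one option is to count duplicate keys in the input file before importing it. The sketch below is only an illustration under assumptions: the key is taken to be the first column, the delimiter is a comma, and a plain dict stands in for the table (a later row with the same key replaces the earlier one, which mirrors the overwrite described above). It does not call HBase.

```python
import csv
import io
from collections import Counter


def load_rows(text, key_column=0, delimiter=","):
    """Return (lines_read, table), where table keeps one row per key.

    The dict stands in for the table: a later row with the same key
    replaces the earlier one.
    """
    table = {}
    lines_read = 0
    for fields in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not fields:  # skip blank lines
            continue
        lines_read += 1
        table[fields[key_column]] = fields
    return lines_read, table


def duplicate_keys(text, key_column=0, delimiter=","):
    """Return {key: count} for every key that appears more than once."""
    counts = Counter(
        fields[key_column]
        for fields in csv.reader(io.StringIO(text), delimiter=delimiter)
        if fields
    )
    return {key: n for key, n in counts.items() if n > 1}


# Made-up sample data: four lines, but only three distinct keys.
sample = "r1,alice\nr2,bob\nr1,carol\nr3,dave\n"

lines_read, table = load_rows(sample)
dupes = duplicate_keys(sample)
print(lines_read, len(table), dupes)
```

If the number of lines read is larger than the number of distinct keys, the difference is the number of rows that a scan would no longer show.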

hbase importtsv

2014-05-01 Thread Kennedy, Sean C.
I ran the following command to import an excel.csv file into HBase. Everything looked OK; however, when I ran a scan on the table in HBase, I did not see as many rows as were in the excel.csv file. Any help appreciated. /hd/hadoop/bin/hadoop jar /hbase/hbase-0.94.15/hbase-0.94.15.jar importtsv '

data transfer from rdbms to hive

2014-05-01 Thread Shushant Arora
Hi, I have a requirement to transfer data from a MySQL RDBMS to a partitioned Hive table, partitioned on year and month. Each record in the MySQL data contains the timestamp of a user activity. What is the best tool for that? 1. Shall I go with Sqoop? 2. How do I compute the dynamic partition from the RDBMS data? Shall

Re: Hive Vs Pig: Master's thesis

2014-05-01 Thread Sarfraz Ramay
> > > Hi, > > It seems that both Hive and Pig are used for managing large data sets. > Hive is more SQL oriented whereas Pig is more for data flows. I am > doing a master's thesis on the performance evaluation of both. Can someone > please provide a list of tasks that would make for an interesting