Re: Hive Vs Pig: Master's thesis

2014-05-02 Thread Thejas Nair
The primary difference between Hive and Pig is the language. There are implementation differences that will result in performance differences, but it will be hard to figure out which aspect of the implementation is responsible for which improvement. I think a more interesting project would be to compare th

Re: MapReduce with HCatalog hangs

2014-05-02 Thread Thejas Nair
HCatInputFormat does not run any initial MapReduce jobs. It seems to me that the MapReduce job actually ran. You might want to run jstack on your Java client program to see what it is waiting on. On Fri, May 2, 2014 at 7:28 AM, Fabian Reinartz wrote: > I implemented a MapReduce job with H
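The jstack suggestion above can be sketched as follows; the pid lookup and the thread names you would grep for are illustrative, not taken from the thread:

```shell
# List running JVMs with their main classes to find the client program's pid.
jps -l

# Dump all thread stacks for that pid (placeholder <pid>); look for the main
# thread blocked in something like Job.waitForCompletion to see what it awaits.
jstack <pid>
```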

Re: Alter non partitioned table into partitioned table

2014-05-02 Thread Hamza Asad
set these values before running the insert overwrite command (execute in the hive console) *set hive.exec.dynamic.partition=true; set hive.exec.dynamic.partition.mode=nonstrict;* On Fri, May 2, 2014 at 6:18 PM, Kishore kumar wrote: > Hi Experts, > > How to change the non partitioned table into partit
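A minimal sketch of the two settings in use, assuming made-up table and column names (`events_raw`, `events_partitioned`, `dt`):

```sql
-- Enable dynamic partitioning for this session.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Copy rows from an unpartitioned table into a partitioned one; the last
-- column in the SELECT populates the dynamic partition key (dt).
INSERT OVERWRITE TABLE events_partitioned PARTITION (dt)
SELECT id, payload, dt FROM events_raw;
```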

MapReduce with HCatalog hangs

2014-05-02 Thread Fabian Reinartz
I implemented a MapReduce job with HCatalog as input and output. It's pretty much the same as the example on the website. If I start my job with `hadoop jar`, an initial MapReduce job is performed (which, I guess, is the query for the HCatalog data, since the setup method in my mapper is not executed). Afte

Re: data transfer from rdbms to hive

2014-05-02 Thread CRAIG LIU
I am new to Hive and here is my idea: 1. Use mysqldump to dump your data to a CSV file. 2. Load the CSV into a Hive temp table. 3. Create a partitioned table. 4. Use dynamic partitioning: select from the temp table to insert into the partitioned table. You can use a UDF to get the date from the timestamp. Regards, Craig 2014-5-
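Steps 1–2 of the recipe above might look roughly like this; the database, table, and path names are invented for illustration:

```shell
# 1. Dump a MySQL table as delimited text. mysqldump --tab writes a .sql
#    schema file plus a delimited .txt data file into the given directory.
mysqldump --tab=/tmp/dump --fields-terminated-by=',' mydb events

# 2. Load the dumped data file into a Hive staging (temp) table that was
#    created with a matching field delimiter.
hive -e "LOAD DATA LOCAL INPATH '/tmp/dump/events.txt' INTO TABLE events_staging;"
```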

Alter non partitioned table into partitioned table

2014-05-02 Thread Kishore kumar
Hi Experts, How to change a non-partitioned table into a partitioned table in Hive? I created a table with create table table_name1(col1 type, col2 type...) row format delimited fields terminated by '|' stored as textfile and loaded data from local with load data local inpath "/to/path" (overwrite)into
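As the replies in this thread imply, Hive cannot ALTER an existing unpartitioned table into a partitioned one; the usual workaround is to create a new partitioned table and copy the data across with a dynamic-partition insert. A sketch, with illustrative table and column names:

```sql
-- New table with the same data columns plus a partition column (dt).
CREATE TABLE table_name1_part (col1 STRING, col2 STRING)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;

SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The trailing dt column drives the dynamic partitioning.
INSERT OVERWRITE TABLE table_name1_part PARTITION (dt)
SELECT col1, col2, dt FROM table_name1;
```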

Re: data transfer from rdbms to hive

2014-05-02 Thread Shushant Arora
For that, do I need to load the files first into a non-partitioned table and then insert from the unpartitioned table into the partitioned one? On Fri, May 2, 2014 at 4:04 PM, Hamza Asad wrote: > Sqoop also support dynamic partitioning. I have done that. For that you > have

Re: data transfer from rdbms to hive

2014-05-02 Thread Matt Tucker
It sounds like you might need to export via Sqoop using a query or view, as the date granularity in your MySQL table is different from the desired Hive table. The overall performance may be lower, as MySQL must do more than just read rows from disk, but you may still find ways to get the data in pa
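A Sqoop free-form query import along these lines could do the date conversion on the MySQL side; the connection string, credentials, column names, and target path are placeholders:

```shell
# $CONDITIONS is required by Sqoop in free-form queries so it can split the
# import across mappers; DATE(ts) truncates the timestamp to the desired
# day-level granularity before the data reaches Hive.
sqoop import \
  --connect jdbc:mysql://dbhost/mydb \
  --username user -P \
  --query 'SELECT id, payload, DATE(ts) AS dt FROM events WHERE $CONDITIONS' \
  --split-by id \
  --target-dir /user/hive/staging/events
```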

largest table last in joins

2014-05-02 Thread Aleksei Udatšnõi
Hello, There is this old recommendation for optimizing Hive joins: use the largest table last in the join. http://archive.cloudera.com/cdh/3/hive/language_manual/joins.html The same recommendation appears in the Programming Hive book. Is this recommendation still valid, or do newer versions of Hive take
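For context on the recommendation being asked about: in a reduce-side join, Hive buffers the rows of the earlier tables for each join key and streams the last table, which is why the largest table traditionally goes last. The STREAMTABLE hint lets you override which table is streamed; table names here are illustrative:

```sql
-- Ask Hive to stream the big table regardless of its position in the join,
-- so the small table is the one buffered in memory.
SELECT /*+ STREAMTABLE(big) */ small.key, big.value
FROM small JOIN big ON small.key = big.key;
```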

Re: data transfer from rdbms to hive

2014-05-02 Thread Hamza Asad
Sqoop also supports dynamic partitioning. I have done that. For that you have to enable dynamic partitioning, i.e. dynamic partition = true, in Hive. On Fri, May 2, 2014 at 12:57 PM, unmesha sreeveni wrote: > > On Fri, May 2, 2014 at 9:41 AM, Shushant Arora > wrote: > >> Sqoop > > > ​Hi Shushant >

Re: data transfer from rdbms to hive

2014-05-02 Thread unmesha sreeveni
On Fri, May 2, 2014 at 9:41 AM, Shushant Arora wrote: > Sqoop ​Hi Shushant, I don't think other ecosystem projects can help you. The only way to import data from a relational DB is Sqoop. http://my.safaribooksonline.com/book/databases/9781449364618/6dot-hadoop-ecosystem-integration/integration_hiv