Hi, Mark

Query 1 is:
1) create table tmp__imp as select requestbegintime, count(*) from impressions2 
where requestbegintime<'1239572996000'; 

from tmp__imp 
insert OVERWRITE LOCAL DIRECTORY '/disk2/is1' select * where 
requestbegintime<'1239572956000' 
insert OVERWRITE LOCAL DIRECTORY '/disk2/is2' select * where 
requestbegintime>='1239572956000';

Do you have any idea about the failure of query 2 and 3? Is the failure caused 
by some issues of Hive?


Regards,
Wei



-----Original Message-----
From: Mark Grover [mailto:mgro...@oanda.com] 
Sent: Wednesday, March 07, 2012 1:11 AM
To: user@hive.apache.org
Subject: Re: how to split query result into several smaller tables without 
creating temp table??

Hi Wei,
In query 1, it's invalid to requestbegintime in the select list if it's not in 
the group by clause. There doesn't seem to be a group by clause there. Is that 
the right query?

Mark

Mark Grover, Business Intelligence Analyst
OANDA Corporation 

www: oanda.com www: fxtrade.com 

"Best Trading Platform" - World Finance's Forex Awards 2009. 
"The One to Watch" - Treasury Today's Adam Smith Awards 2009. 


----- Original Message -----
From: "Wei Lu" <w...@microstrategy.com>
To: user@hive.apache.org
Sent: Tuesday, March 6, 2012 4:40:00 AM
Subject: how to split query result into several smaller tables without creating 
temp table??




Hi, 



I tried to do aggregation based on Table impressions2, and then need to save 
the results to two different local files (or tables). 

I tried three methods, only the first one succeeded: 



1) create a new table and store aggregation results to it, and then use 
multi-insert to split the results to local disk: 

create table tmp__imp as select requestbegintime, count(*) from impressions2 
where 



requestbegintime<'1239572996000'; 

from tmp__imp 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is1' select * where 
requestbegintime<'1239572956000' 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is2' select * where 
requestbegintime>='1239572956000'; 



The method works, and #rows of each local file: 

40 Rows loaded to /disk2/is2 

31057 Rows loaded to /disk2/is1 



2) use one statement without create new table: 

from (select requestbegintime, count(*) as ct from impressions2 where 
requestbegintime<'1239572996000' GROUP BY requestbegintime)mt 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is1' select mt.requestbegintime,mt.ct 
where mt.requestbegintime<'1239572956000' 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is2' select mt.requestbegintime,mt.ct 
where mt.requestbegintime>='1239572956000'; 



But all aggregated results are copies to both files: 

31097 Rows loaded to /disk2/is2 

31097 Rows loaded to /disk2/is1 



3) create a view and store aggregation results to it, and then use multi-insert 
to split the results to local disk: 

create view view_imp(rt, ct) as select requestbegintime, count(*) as ct from 
impressions2 where requestbegintime<'1239572996000' GROUP BY requestbegintime; 

from view_imp 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is1' select * where rt<'1239572956000' 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is2' select * where 
rt>='1239572956000'; 



But, again, all aggregated results are copies to both files: 

31097 Rows loaded to /disk2/is2 

31097 Rows loaded to /disk2/is1 



Why does 2) and 3) fail? How can I split the results without creating new 
table? 





Regards, 

Wei 

Reply via email to