Hi Wei,
In query 1, it's invalid to requestbegintime in the select list if it's not in 
the group by clause. There doesn't seem to be a group by clause there. Is that 
the right query?

Mark

Mark Grover, Business Intelligence Analyst
OANDA Corporation 

www: oanda.com www: fxtrade.com 

"Best Trading Platform" - World Finance's Forex Awards 2009. 
"The One to Watch" - Treasury Today's Adam Smith Awards 2009. 


----- Original Message -----
From: "Wei Lu" <w...@microstrategy.com>
To: user@hive.apache.org
Sent: Tuesday, March 6, 2012 4:40:00 AM
Subject: how to split query result into several smaller tables without creating 
temp table??




Hi, 



I tried to do aggregation based on Table impressions2, and then need to save 
the results to two different local files (or tables). 

I tried three methods, only the first one succeeded: 



1) create a new table and store aggregation results to it, and then use 
multi-insert to split the results to local disk: 

create table tmp__imp as select requestbegintime, count(*) from impressions2 
where 



requestbegintime<'1239572996000'; 

from tmp__imp 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is1' select * where 
requestbegintime<'1239572956000' 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is2' select * where 
requestbegintime>='1239572956000'; 



The method works, and #rows of each local file: 

40 Rows loaded to /disk2/is2 

31057 Rows loaded to /disk2/is1 



2) use one statement without create new table: 

from (select requestbegintime, count(*) as ct from impressions2 where 
requestbegintime<'1239572996000' GROUP BY requestbegintime)mt 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is1' select mt.requestbegintime,mt.ct 
where mt.requestbegintime<'1239572956000' 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is2' select mt.requestbegintime,mt.ct 
where mt.requestbegintime>='1239572956000'; 



But all aggregated results are copies to both files: 

31097 Rows loaded to /disk2/is2 

31097 Rows loaded to /disk2/is1 



3) create a view and store aggregation results to it, and then use multi-insert 
to split the results to local disk: 

create view view_imp(rt, ct) as select requestbegintime, count(*) as ct from 
impressions2 where requestbegintime<'1239572996000' GROUP BY requestbegintime; 

from view_imp 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is1' select * where rt<'1239572956000' 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is2' select * where 
rt>='1239572956000'; 



But, again, all aggregated results are copies to both files: 

31097 Rows loaded to /disk2/is2 

31097 Rows loaded to /disk2/is1 



Why does 2) and 3) fail? How can I split the results without creating new 
table? 





Regards, 

Wei 

Reply via email to