Hi,

I tried to do aggregation based on Table impressions2, and then need to save 
the results to two different local files (or tables).
I tried three methods, only the first one succeeded:

1) create a new table and store aggregation results to it, and then use 
multi-insert to split the results to local disk:
create table tmp__imp as select requestbegintime, count(*) from impressions2 
where

requestbegintime<'1239572996000';
from tmp__imp
insert OVERWRITE LOCAL DIRECTORY '/disk2/is1' select * where 
requestbegintime<'1239572956000'
insert OVERWRITE LOCAL DIRECTORY '/disk2/is2' select * where 
requestbegintime>='1239572956000';

The method works, and #rows of each local file:
40 Rows loaded to /disk2/is2
31057 Rows loaded to /disk2/is1

2) use one statement without create new table:
from (select requestbegintime, count(*) as ct from impressions2 where 
requestbegintime<'1239572996000' GROUP BY requestbegintime)mt
insert OVERWRITE LOCAL DIRECTORY '/disk2/is1' select  mt.requestbegintime,mt.ct 
 where mt.requestbegintime<'1239572956000'
insert OVERWRITE LOCAL DIRECTORY '/disk2/is2' select  mt.requestbegintime,mt.ct 
 where mt.requestbegintime>='1239572956000';

But all aggregated results are copies to both files:
31097 Rows loaded to /disk2/is2
31097 Rows loaded to /disk2/is1

3) create a view and store aggregation results to it, and then use multi-insert 
to split the results to local disk:
create view view_imp(rt, ct) as select requestbegintime, count(*) as ct from 
impressions2 where requestbegintime<'1239572996000' GROUP BY requestbegintime;
from view_imp
insert OVERWRITE LOCAL DIRECTORY '/disk2/is1' select * where rt<'1239572956000'
insert OVERWRITE LOCAL DIRECTORY '/disk2/is2' select * where 
rt>='1239572956000';

But, again, all aggregated results are copies to both files:
31097 Rows loaded to /disk2/is2
31097 Rows loaded to /disk2/is1

Why does 2) and 3) fail? How can I split the results without creating new table?


Regards,
Wei

Reply via email to