For information, i created https://issues.apache.org/jira/browse/HIVE-3699
On Tuesday 6 November 2012 at 21:40, Alexandre Fouche wrote: > Hi, > > I am doing a query with multiple INSERT OVERWRITE to multiple tables, in > order to scan the dataset only 1 time, and i end up having all these tables > with the same content ! It seems the GROUP BY query that returns results is > overwriting all the tables. > > This is the misbehaving query: > > FROM nikon > INSERT OVERWRITE TABLE e1 > SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Impressions WHERE > qs_cs_s_cat='PRINT' GROUP BY qs_cs_s_aid > INSERT OVERWRITE TABLE e2 > SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Vues WHERE > qs_cs_s_cat='VIEW' GROUP BY qs_cs_s_aid > ; > > It launches only one MR job and here are the results. Why does table 'e1' > contains results from table 'e2' ?! Table 'e1' should have been empty (see > individual SELECTs further below) > > hive> SELECT * from e1; > OK > NULL 2 > 1627575 25 > 1627576 70 > 1690950 22 > 1690952 42 > 1696705 199 > 1696706 66 > 1696730 229 > 1696759 85 > 1696893 218 > Time taken: 0.229 seconds > > hive> SELECT * from e2; > OK > NULL 2 > 1627575 25 > 1627576 70 > 1690950 22 > 1690952 42 > 1696705 199 > 1696706 66 > 1696730 229 > 1696759 85 > 1696893 218 > Time taken: 0.11 seconds > > > Here is are the result to the indiviual queries (only the second query > returns a result set): > > hive> SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Impressions FROM > nikon WHERE qs_cs_s_cat='PRINT' GROUP BY qs_cs_s_aid; > (...) > OK > <- There are no results, this is normal > Time taken: 41.471 seconds > > hive> SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Vues FROM nikon > WHERE qs_cs_s_cat='VIEW' GROUP BY qs_cs_s_aid; > (...) > OK > NULL 2 > 1627575 25 > 1627576 70 > 1690950 22 > 1690952 42 > 1696705 199 > 1696706 66 > 1696730 229 > 1696759 85 > 1696893 218 > Time taken: 39.607 seconds > >