Can you try specifying an explicit name for the COUNT() column in the union (and any other columns that are not just plain columns already)? I wonder if CBO is just generating a weird name for it that cannot be used in CTAS.
From: Premal Shah <premal.j.s...@gmail.com<mailto:premal.j.s...@gmail.com>> Reply-To: "user@hive.apache.org<mailto:user@hive.apache.org>" <user@hive.apache.org<mailto:user@hive.apache.org>> Date: Wednesday, November 9, 2016 at 23:16 To: "user@hive.apache.org<mailto:user@hive.apache.org>" <user@hive.apache.org<mailto:user@hive.apache.org>> Subject: Re: Hive on Tez CTAS query breaks Hi Gopal, Thanx for the suggestion. It works with the setting you suggested. What does this mean? Do I need to special case this query. Also, I am trying different things to see what is breaking. Looks like I have a UNION ALL and both sides have a query with a GROUP BY. This breaks. CREATE TABLE unique_ip_tmp AS SELECT DISTINCT new.ip FROM ( SELECT COUNT(0) , ip FROM t1 WHERE dt BETWEEN '2016-11-08' AND '2016-11-08' GROUP BY ip UNION ALL SELECT COUNT(0) , ip FROM t2 WHERE dt BETWEEN '2016-11-08' AND '2016-11-08' GROUP BY ip ) new LEFT JOIN unique_ip old ON old.ip = new.ip WHERE old.ip IS NULL ; If I remove one of the queries in the UNION, it works CREATE TABLE unique_ip_tmp AS SELECT DISTINCT new.ip FROM ( SELECT COUNT(0) , ip FROM map_activity WHERE dt BETWEEN '2016-11-08' AND '2016-11-08' GROUP BY ip ) new LEFT JOIN unique_ip old ON old.ip = new.ip WHERE old.ip IS NULL ; If I create tmp tables from the group by queries and use them, that works too CREATE TABLE unique_ip_tmp AS SELECT DISTINCT new.ip FROM ( SELECT * FROM dropme_t1 UNION ALL SELECT * FROM dropme_t2 ) new LEFT JOIN unique_ip old ON old.ip = new.ip WHERE old.ip IS NULL ; Turning off CBO cluster-wide won't be the right thing to do, would it? On Wed, Nov 9, 2016 at 10:49 PM, Gopal Vijayaraghavan <gop...@apache.org<mailto:gop...@apache.org>> wrote: > If I run a query with CREATE TABLE AS, it breaks with the error below. > However, just running the query works if I don't try to create a table from > the results. It does not happen to all CTAS queries. Not sure if that's related to Tez at all. Can try running it with set hive.cbo.enable=false; Cheers, Gopal -- Regards, Premal Shah.