Can you try specifying an explicit name for the COUNT() column in the union 
(and any other columns that are not just plain columns already)?
I wonder if CBO is just generating a weird name for it that cannot be used in 
CTAS.

From: Premal Shah <premal.j.s...@gmail.com<mailto:premal.j.s...@gmail.com>>
Reply-To: "user@hive.apache.org<mailto:user@hive.apache.org>" 
<user@hive.apache.org<mailto:user@hive.apache.org>>
Date: Wednesday, November 9, 2016 at 23:16
To: "user@hive.apache.org<mailto:user@hive.apache.org>" 
<user@hive.apache.org<mailto:user@hive.apache.org>>
Subject: Re: Hive on Tez CTAS query breaks

Hi Gopal,
Thanx for the suggestion. It works with the setting you suggested.

What does this mean? Do I need to special case this query.

Also, I am trying different things to see what is breaking. Looks like I have a 
UNION ALL and both sides have a query with a GROUP BY.

This breaks.

CREATE TABLE unique_ip_tmp AS
SELECT DISTINCT
    new.ip
FROM
    (
        SELECT COUNT(0) , ip
        FROM t1
        WHERE dt BETWEEN '2016-11-08' AND '2016-11-08'
        GROUP BY ip

        UNION ALL

        SELECT COUNT(0) , ip
        FROM t2
        WHERE dt BETWEEN '2016-11-08' AND '2016-11-08'
        GROUP BY ip
    ) new
    LEFT JOIN unique_ip old
        ON old.ip = new.ip
WHERE
    old.ip IS NULL
;


If I remove one of the queries in the UNION, it works

CREATE TABLE unique_ip_tmp AS
SELECT DISTINCT
    new.ip
FROM
    (
        SELECT
            COUNT(0)
            , ip
        FROM
            map_activity
        WHERE
            dt BETWEEN '2016-11-08' AND '2016-11-08'
        GROUP BY
            ip
    ) new
    LEFT JOIN unique_ip old
        ON old.ip = new.ip
WHERE
    old.ip IS NULL
;


If I create tmp tables from the group by queries and use them, that works too

CREATE TABLE unique_ip_tmp AS
SELECT DISTINCT
    new.ip
FROM
    (
        SELECT * FROM dropme_t1
        UNION ALL
        SELECT * FROM dropme_t2
    ) new
    LEFT JOIN unique_ip old
        ON old.ip = new.ip
WHERE
    old.ip IS NULL
;


Turning off CBO cluster-wide won't be the right thing to do, would it?



On Wed, Nov 9, 2016 at 10:49 PM, Gopal Vijayaraghavan 
<gop...@apache.org<mailto:gop...@apache.org>> wrote:


> If I run a query with CREATE TABLE AS, it breaks with the error below. 
> However, just running the query works if I don't try to create a table from 
> the results. It does not happen to all CTAS queries.

Not sure if that's related to Tez at all.

Can try running it with

set hive.cbo.enable=false;

Cheers,
Gopal





--
Regards,
Premal Shah.

Reply via email to