[ https://issues.apache.org/jira/browse/HIVE-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Furcy Pin updated HIVE-10931: ----------------------------- Description: The following set of queries : {code:sql} DROP TABLE IF EXISTS test1 ; DROP TABLE IF EXISTS test2 ; DROP TABLE IF EXISTS test3 ; CREATE TABLE test1 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ; INSERT INTO TABLE test1 VALUES (1,NULL,NULL,NULL,NULL,"A") ; CREATE TABLE test2 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ; INSERT INTO TABLE test2 VALUES (1,NULL,NULL,NULL,NULL,"X") ; CREATE TABLE test3 (coL1 STRING) ; INSERT INTO TABLE test3 VALUES ("A") ; SELECT T2.val FROM test1 T1 LEFT JOIN (SELECT col1, col2, col3, col4, col5, COALESCE(col6,"") as val FROM test2) T2 ON T2.col1 = T1.col1 LEFT JOIN test3 T3 ON T3.col1 = T1.col6 ; {code} will return this : {noformat} +----------+--+ | t2.val | +----------+--+ | A | +----------+--+ {noformat} Obviously, this result is wrong as table `test2` contains a "X" and no "A". This is the most minimal example we found of this issue, in particular having less than 6 columns in the tables will work, for instance : {code:sql} SELECT T2.val FROM test1 T1 LEFT JOIN (SELECT col1, col2, col3, col4, COALESCE(col6,"") as val FROM test2) T2 ON T2.col1 = T1.col1 LEFT JOIN test3 T3 ON T3.col1 = T1.col6 ; {code} (same query as before, but `col5` was removed from the select) will return : {noformat} +----------+--+ | t2.val | +----------+--+ | X | +----------+--+ {noformat} Removing the `COALESCE` also removes the bug... was: The following set of queries : {code:sql} DROP TABLE IF EXISTS test1 ; DROP TABLE IF EXISTS test2 ; DROP TABLE IF EXISTS test3 ; CREATE TABLE test1 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ; INSERT INTO TABLE test1 VALUES (1,NULL,NULL,NULL,NULL,"A") ; CREATE TABLE test2 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ; INSERT INTO TABLE test2 VALUES (1,NULL,NULL,NULL,NULL,"X") ; CREATE TABLE test3 (coL1 STRING) ; INSERT INTO TABLE test3 VALUES ("A") ; SELECT T2.val FROM test1 T1 LEFT JOIN (SELECT col1, col2, col3, col4, col5, COALESCE(col6,"") as val FROM test2) T2 ON T2.col1 = T1.col1 LEFT JOIN test3 T3 ON T3.col1 = T1.col6 ; {code} will return this : {noformat} +----------+--+ | t2.val | +----------+--+ | A | +----------+--+ {noformat} Obviously, this result is wrong as table `test2` contains a "X" and no "A". This is the most minimal example we found of this issue, in particular having less than 6 columns in the tables will work, for instance : {code:sql} SELECT T2.val FROM test1 T1 LEFT JOIN (SELECT col1, col2, col3, col4, COALESCE(col6,"") as val FROM test2) T2 ON T2.col1 = T1.col1 LEFT JOIN test3 T3 ON T3.col1 = T1.col6 ; {code:sql} (same query as before, but `col5` was removed from the select) will return : {noformat} +----------+--+ | t2.val | +----------+--+ | X | +----------+--+ {noformat} Removing the `COALESCE` also removes the bug... > Wrong columns selected on multiple joins > ---------------------------------------- > > Key: HIVE-10931 > URL: https://issues.apache.org/jira/browse/HIVE-10931 > Project: Hive > Issue Type: Bug > Affects Versions: 1.1.0 > Environment: Cloudera cdh5.4.2 > Reporter: Furcy Pin > > The following set of queries : > {code:sql} > DROP TABLE IF EXISTS test1 ; > DROP TABLE IF EXISTS test2 ; > DROP TABLE IF EXISTS test3 ; > CREATE TABLE test1 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 > STRING, col6 STRING) ; > INSERT INTO TABLE test1 VALUES (1,NULL,NULL,NULL,NULL,"A") ; > CREATE TABLE test2 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 > STRING, col6 STRING) ; > INSERT INTO TABLE test2 VALUES (1,NULL,NULL,NULL,NULL,"X") ; > CREATE TABLE test3 (coL1 STRING) ; > INSERT INTO TABLE test3 VALUES ("A") ; > SELECT > T2.val > FROM test1 T1 > LEFT JOIN (SELECT col1, col2, col3, col4, col5, COALESCE(col6,"") as val > FROM test2) T2 > ON T2.col1 = T1.col1 > LEFT JOIN test3 T3 > ON T3.col1 = T1.col6 > ; > {code} > will return this : > {noformat} > +----------+--+ > | t2.val | > +----------+--+ > | A | > +----------+--+ > {noformat} > Obviously, this result is wrong as table `test2` contains a "X" and no "A". > This is the most minimal example we found of this issue, in particular > having less than 6 columns in the tables will work, for instance : > {code:sql} > SELECT > T2.val > FROM test1 T1 > LEFT JOIN (SELECT col1, col2, col3, col4, COALESCE(col6,"") as val FROM > test2) T2 > ON T2.col1 = T1.col1 > LEFT JOIN test3 T3 > ON T3.col1 = T1.col6 > ; > {code} > (same query as before, but `col5` was removed from the select) > will return : > {noformat} > +----------+--+ > | t2.val | > +----------+--+ > | X | > +----------+--+ > {noformat} > Removing the `COALESCE` also removes the bug... -- This message was sent by Atlassian JIRA (v6.3.4#6332)