Lenni Kuff created HIVE-6590:
--------------------------------

             Summary: Hive does not work properly with boolean partition 
columns (wrong results and inserts to incorrect HDFS path)
                 Key: HIVE-6590
                 URL: https://issues.apache.org/jira/browse/HIVE-6590
             Project: Hive
          Issue Type: Bug
          Components: Database/Schema
    Affects Versions: 0.10.0
            Reporter: Lenni Kuff


Hive does not work properly with boolean partition columns. Queries return 
wrong results and also insert to incorrect HDFS paths.

{code}
create table bool_table(int_col int) partitioned by(bool_col boolean);
-- This works, creating 3 unique partitions!
ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE);
ALTER TABLE bool_table ADD PARTITION (bool_col=false);
ALTER TABLE bool_table ADD PARTITION (bool_col=False);
{code}

The first problem is that Hive cannot filter on a boolean partition key column. 
"select * from bool_table" returns the correct results, but if you apply a 
filter on the boolean partition key column, Hive returns no results.

The second problem is that Hive seems to just call "toString()" on the boolean 
literal value. This means you can end up with multiple partitions (FALSE, 
false, FaLSE, etc.) that all map to the same logical value false. For example, 
you can add three partitions in Hive for the same logical value "false" by doing:
ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE) -> 
/test-warehouse/bool_table/bool_col=FALSE/
ALTER TABLE bool_table ADD PARTITION (bool_col=false) -> 
/test-warehouse/bool_table/bool_col=false/
ALTER TABLE bool_table ADD PARTITION (bool_col=False) -> 
/test-warehouse/bool_table/bool_col=False/
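
To make the path behavior above concrete, here is a minimal sketch in Python. It 
is not Hive code: both function names are hypothetical, and the "normalized" 
variant is only one assumed way a fix could behave, not what Hive does or will do. 
It shows that using the literal text verbatim yields three directory names, while 
parsing the literal as a boolean first yields one.

```python
def partition_dir_verbatim(column: str, literal: str) -> str:
    """Build the directory name from the literal exactly as typed.

    This mirrors the reported behavior; the function name is hypothetical.
    """
    return f"{column}={literal}"


def partition_dir_normalized(column: str, literal: str) -> str:
    """Assumed fix: parse the literal as a boolean, then render it canonically."""
    lowered = literal.strip().lower()
    if lowered not in ("true", "false"):
        raise ValueError(f"not a boolean literal: {literal!r}")
    return f"{column}={lowered}"


literals = ["FALSE", "false", "False"]

# Verbatim naming keeps every spelling as its own directory.
verbatim = {partition_dir_verbatim("bool_col", lit) for lit in literals}

# Normalized naming collapses all spellings into a single directory.
normalized = {partition_dir_normalized("bool_col", lit) for lit in literals}

print(sorted(verbatim))    # three distinct directory names
print(sorted(normalized))  # a single directory name
```

With normalization, all three ALTER TABLE statements above would target the same 
directory, so the second and third would be duplicates of the first.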



--
This message was sent by Atlassian JIRA
(v6.2#6252)
