Boolean columns in Hive tables containing NULL are treated as FALSE by the Hive 
JDBC driver.
--------------------------------------------------------------------------------------------

                 Key: HIVE-1863
                 URL: https://issues.apache.org/jira/browse/HIVE-1863
             Project: Hive
          Issue Type: Bug
          Components: Drivers
    Affects Versions: 0.5.0
            Reporter: Guy le Mar


(1) Using the Hive CLI, create a table using...
create table dt4_boolean
(
        dt4_id          int,
        dt4_testbool    boolean,
        dt4_string      string
)
row format delimited
        fields  terminated by ','
        lines  terminated by '\n';

(2) Create a file containing the following text...
1,true,Value is True
2,null,Data says null and must be null
3,,No value that means null
4,NoIdea,Data says NoIdea that's gonna be null
5,false,Value is FALSE

(3) Load the data in the file into the Hive table...
load data local inpath '<DATA FILE PATH>' overwrite into table dt4_boolean;

(4) Check the table works as expected using the Hive CLI...
hive> select * from dt4_boolean;
OK
1       true    Value is True
2       NULL    Data says null and must be null
3       NULL    No value that means null
4       NULL    Data says NoIdea that's gonna be null
5       false   Value is FALSE
Time taken: 0.049 seconds

(5) Using the Hive JDBC driver, execute the same Hive query (select * from 
dt4_boolean)
(5.1) The "row_str" values obtained by the Hive JDBC driver for deserialization 
are correct...
1       true    Value is True
2       NULL    Data says null and must be null
3       NULL    No value that means null
4       NULL    Data says NoIdea that's gonna be null
5       false   Value is FALSE

(5.2) However, when these "row_str" are deserialized by the DynamicSerDe to a 
java.lang.Object, the NULL boolean values are converted to FALSE - instead of 
being null.

As a consequence, the application making use of the Hive JDBC driver produces 
this (incorrect) output...

SQL> select dt4_id, dt4_testbool from dt4_boolean;

    DT4_ID DT4_TESTBOOL
---------- ------------
         1            true
         2            false
         3            false
         4            false
         5            false

...instead of producing this (correct) output...
SQL> select dt4_id, dt4_testbool from dt4_boolean;

    DT4_ID DT4_TESTBOOL
---------- ------------
         1            true
         2            NULL
         3            NULL
         4            NULL
         5            false


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to