Lars Francke created HIVE-3179:
----------------------------------

Summary: HBase Handler doesn't handle NULLs properly
Key: HIVE-3179
URL: https://issues.apache.org/jira/browse/HIVE-3179
Project: Hive
Issue Type: Bug
Components: HBase Handler
Affects Versions: 0.9.0
Reporter: Lars Francke
Priority: Critical

We found a fairly severe issue in the HBase Handler: Hive can return incorrect data if a column has NULL values in HBase (which means the cell doesn't even exist).

In HBase Shell:

{noformat}
create 'hive_hbase_test', 'test'
put 'hive_hbase_test', '1', 'test:c1', 'c1-1'
put 'hive_hbase_test', '1', 'test:c2', 'c2-1'
put 'hive_hbase_test', '1', 'test:c3', 'c3-1'
put 'hive_hbase_test', '2', 'test:c1', 'c1-2'
{noformat}

In Hive:

{noformat}
DROP TABLE IF EXISTS hive_hbase_test;
CREATE EXTERNAL TABLE hive_hbase_test (
  id int,
  c1 string,
  c2 string,
  c3 string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key#s,test:c1#s,test:c2#s,test:c3#s")
TBLPROPERTIES("hbase.table.name" = "hive_hbase_test");

hive> select * from hive_hbase_test;
OK
1    c1-1    c2-1    c3-1
2    c1-2    NULL    NULL

hive> select c1 from hive_hbase_test;
c1-1
c1-2

hive> select c1, c2 from hive_hbase_test;
c1-1    c2-1
c1-2    NULL
{noformat}

So far everything is correct, but now:

{noformat}
hive> select c1, c2, c2 from hive_hbase_test;
c1-1    c2-1    c2-1
c1-2    NULL    c2-1
{noformat}

Selecting c2 twice works the first time, but the second time we actually get the value from the previous row. The same happens with more columns:

{noformat}
hive> select c1, c3, c2, c2, c3, c3, c1 from hive_hbase_test;
c1-1    c3-1    c2-1    c2-1    c3-1    c3-1    c1-1
c1-2    NULL    NULL    c2-1    c3-1    c3-1    c1-2
{noformat}

We've narrowed this down to an early initialization of {{fieldsInited[fieldID] = true;}} in {{LazyHBaseRow#uncheckedGetField}}, and we'll try to provide a patch, which will certainly need review.
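
To make the suspected mechanism concrete, here is a toy model of it. This is not the Hive source: {{ToyLazyRow}}, {{select}} and the {{clearOnMissing}} switch are hypothetical names, and the "remedy" branch is only one possible fix, not the patch itself. The sketch assumes a row object that is reused across rows, keeps its decoded values from the previous row, and marks a field as initialized before it knows whether the cell exists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT the Hive source. All names here are hypothetical.
class ToyLazyRow {
    static final String[] COLUMNS = {"c1", "c2", "c3"};

    private final String[] cached = new String[COLUMNS.length];         // survives across rows
    private final boolean[] fieldsInited = new boolean[COLUMNS.length]; // reset for every row
    private final boolean clearOnMissing;                                // true = assumed remedy
    private Map<String, String> cells;

    ToyLazyRow(boolean clearOnMissing) {
        this.clearOnMissing = clearOnMissing;
    }

    // Point the same object at the next row; cached values are left in place.
    void init(Map<String, String> cells) {
        this.cells = cells;
        Arrays.fill(fieldsInited, false);
    }

    String getField(int id) {
        if (!fieldsInited[id]) {
            fieldsInited[id] = true;           // set before we know whether the cell exists
            String raw = cells.get(COLUMNS[id]);
            if (raw == null) {
                if (clearOnMissing) {
                    cached[id] = null;         // assumed remedy: drop the stale value
                }
                return null;                   // the first access correctly yields NULL
            }
            cached[id] = raw;
        }
        return cached[id];                     // later accesses trust the cache
    }

    // Evaluates "select <fieldIds>" over the two rows from the report.
    static List<String> select(boolean clearOnMissing, int... fieldIds) {
        Map<String, String> row1 = new HashMap<>();
        row1.put("c1", "c1-1");
        row1.put("c2", "c2-1");
        row1.put("c3", "c3-1");
        Map<String, String> row2 = new HashMap<>();
        row2.put("c1", "c1-2");

        ToyLazyRow row = new ToyLazyRow(clearOnMissing);
        List<String> out = new ArrayList<>();
        for (Map<String, String> cells : Arrays.asList(row1, row2)) {
            row.init(cells);
            List<String> values = new ArrayList<>();
            for (int id : fieldIds) {
                String v = row.getField(id);
                values.add(v == null ? "NULL" : v);
            }
            out.add(String.join(" ", values));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(select(false, 0, 1, 1)); // [c1-1 c2-1 c2-1, c1-2 NULL c2-1]
        System.out.println(select(true, 0, 1, 1));  // [c1-1 c2-1 c2-1, c1-2 NULL NULL]
    }
}
```

With {{clearOnMissing}} set to false, the toy model reproduces both result sets shown above, including {{c1-2 NULL NULL c2-1 c3-1 c3-1 c1-2}} for the seven-column query: the first read of a missing cell returns NULL, and every later read in the same row returns whatever the previous row left in the cache.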