[ https://issues.apache.org/jira/browse/HIVE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574553#comment-13574553 ]
Mark Grover commented on HIVE-3179:
-----------------------------------

Running TestHBaseCliDriver tests...

> HBase Handler doesn't handle NULLs properly
> -------------------------------------------
>
>                 Key: HIVE-3179
>                 URL: https://issues.apache.org/jira/browse/HIVE-3179
>             Project: Hive
>          Issue Type: Bug
>          Components: HBase Handler
>    Affects Versions: 0.9.0, 0.10.0
>            Reporter: Lars Francke
>            Priority: Critical
>         Attachments: HIVE-3179.1.patch
>
>
> We found a quite severe issue in the HBase Handler which actually means that
> Hive potentially returns incorrect data if a column has NULL values in HBase
> (which means the cell doesn't even exist)
> In HBase Shell:
> {noformat}
> create 'hive_hbase_test', 'test'
> put 'hive_hbase_test', '1', 'test:c1', 'c1-1'
> put 'hive_hbase_test', '1', 'test:c2', 'c2-1'
> put 'hive_hbase_test', '1', 'test:c3', 'c3-1'
> put 'hive_hbase_test', '2', 'test:c1', 'c1-2'
> {noformat}
> In Hive:
> {noformat}
> DROP TABLE IF EXISTS hive_hbase_test;
> CREATE EXTERNAL TABLE hive_hbase_test (
>   id int,
>   c1 string,
>   c2 string,
>   c3 string
> )
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" =
> ":key#s,test:c1#s,test:c2#s,test:c3#s")
> TBLPROPERTIES("hbase.table.name" = "hive_hbase_test");
> hive> select * from hive_hbase_test;
> OK
> 1	c1-1	c2-1	c3-1
> 2	c1-2	NULL	NULL
> hive> select c1 from hive_hbase_test;
> c1-1
> c1-2
> hive> select c1, c2 from hive_hbase_test;
> c1-1	c2-1
> c1-2	NULL
> {noformat}
> So far everything is correct but now:
> {noformat}
> hive> select c1, c2, c2 from hive_hbase_test;
> c1-1	c2-1	c2-1
> c1-2	NULL	c2-1
> {noformat}
> Selecting c2 twice works the first time but the second time we
> actually get the value from the previous row.
> {noformat}
> hive> select c1, c3, c2, c2, c3, c3, c1 from hive_hbase_test;
> c1-1	c3-1	c2-1	c2-1	c3-1	c3-1	c1-1
> c1-2	NULL	NULL	c2-1	c3-1	c3-1	c1-2
> {noformat}
> We've narrowed this down to an early initialization of
> {{fieldsInited\[fieldID] = true}} in {{LazyHBaseRow#uncheckedGetField}} and
> we'll try to provide a patch which surely needs review.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
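
For readers following the description above, here is a minimal Java sketch of how setting an "initialized" flag too early could produce the stale values shown in the queries. It is a simplified model, not the Hive source: the class LazyRowSketch, the nested LazyRow, and the methods getFieldBuggy, getFieldFixed and select are all hypothetical names, and plain strings stand in for HBase cells. The assumption is that one row object is reused for every record, with only the flags reset between rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LazyRowSketch {

    /** Hypothetical stand-in for a lazily deserialized row reused for every record. */
    static final class LazyRow {
        private String[] raw;                 // a null entry models a missing HBase cell
        private final String[] cached;        // survives across rows, like reused field objects
        private final boolean[] fieldsInited;

        LazyRow(int width) {
            cached = new String[width];
            fieldsInited = new boolean[width];
        }

        /** Point the row at a new record; only the flags are reset, not the cache. */
        void init(String[] newRaw) {
            raw = newRaw;
            Arrays.fill(fieldsInited, false);
        }

        /** Models the reported problem: the flag is set before the NULL early return. */
        String getFieldBuggy(int id) {
            if (!fieldsInited[id]) {
                fieldsInited[id] = true;
                if (raw[id] == null) {
                    return null;              // cached[id] still holds the previous row's value
                }
                cached[id] = raw[id];
            }
            return cached[id];                // a second access lands here and may be stale
        }

        /** One possible fix (an assumption, not the attached patch): record the NULL too. */
        String getFieldFixed(int id) {
            if (!fieldsInited[id]) {
                fieldsInited[id] = true;
                cached[id] = raw[id];         // overwrites the stale value, even with null
            }
            return cached[id];
        }
    }

    /** Reads the given column indexes from every row, printing null as NULL. */
    static List<String> select(boolean fixed, String[][] rows, int... cols) {
        LazyRow row = new LazyRow(rows[0].length);
        List<String> out = new ArrayList<>();
        for (String[] r : rows) {
            row.init(r);
            StringBuilder sb = new StringBuilder();
            for (int c : cols) {
                String v = fixed ? row.getFieldFixed(c) : row.getFieldBuggy(c);
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(v == null ? "NULL" : v);
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"c1-1", "c2-1", "c3-1"},
            {"c1-2", null, null},
        };
        System.out.println(select(false, rows, 0, 1, 1)); // [c1-1 c2-1 c2-1, c1-2 NULL c2-1]
        System.out.println(select(true, rows, 0, 1, 1));  // [c1-1 c2-1 c2-1, c1-2 NULL NULL]
    }
}
```

In this model the first read of a missing column returns NULL through the early return, while a second read of the same column finds the flag already set and falls through to the value cached from the previous row. That matches the "NULL c2-1" pattern in the report. Whether the real fix should clear the cached object or defer setting the flag is left to the patch under review.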