[ https://issues.apache.org/jira/browse/HIVE-12844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
david updated HIVE-12844:
-------------------------
    Description:
In HBase 1.0.2, I created a table 'test1' with the following rows and values:

hbase(main):027:0> scan 'test1'
ROW    COLUMN+CELL
 a1    column=df1:a2, timestamp=1452505991743, value=ddd
 a1    column=df1:a3, timestamp=1452506082723, value=eee
 a1    column=df1:c2, timestamp=1452505705391, value=bbb
 b1    column=df1:a2, timestamp=1452505838737, value=ccc
 b1    column=df1:a3, timestamp=1452506149461, value=fff
 r1    column=df1:a, timestamp=1452507261849, value=hhh
 r1    column=df1:a1, timestamp=1452507100774, value=ggg
 r1    column=df1:c1, timestamp=1451221711588, value=aaa

Then I created a Hive 1.2.1 table:

create external table test3(
  key string,
  coll string,
  col2 string,
  col3 string,
  col4 string,
  col5 string,
  col6 string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES
("hbase.columns.mapping" =
":key,df1:a,df1:1,df1:a2,df1:a3,df1:c1,df1:c2")
TBLPROPERTIES("hbase.table.name" = "test1");

When I run this query in Hive:

hive> select * from test3;
OK
a1    NULL    NULL    ddd     eee     NULL    bbb
b1    NULL    NULL    ccc     fff     NULL    NULL
r1    hhh     NULL    NULL    NULL    aaa     NULL

the result is correct. But when I run:

select count(1) from test3;
Total MapReduce CPU Time Spent: 6 seconds 770 msec
OK
1

it returns "1" instead of 3. It appears not to count the rows where the first mapped column is null.
Could you help analyze this?
By the way, the Hadoop version is 2.6.0.

> hive-1.2.1 doesn't return correct value when run select count query
> -------------------------------------------------------------------
>
>                 Key: HIVE-12844
>                 URL: https://issues.apache.org/jira/browse/HIVE-12844
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.2.1
>            Reporter: david
>            Priority: Critical
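
To make the reported discrepancy concrete, here is a small Python model of standard SQL counting semantics applied to the three rows from the `select * from test3` output. It is only an illustration, not Hive code: the helper names `count_rows` and `count_column` are hypothetical, and `None` stands in for NULL. Under these semantics `count(1)` counts every row, while `count(col)` skips rows where `col` is NULL.

```python
# Rows exactly as printed by `select * from test3` in the report.
# None stands in for NULL. Column order: key, coll, col2, col3, col4, col5, col6.
rows = [
    ("a1", None, None, "ddd", "eee", None, "bbb"),
    ("b1", None, None, "ccc", "fff", None, None),
    ("r1", "hhh", None, None, None, "aaa", None),
]


def count_rows(rows):
    """Model of count(1) / count(*): every row counts, NULLs included."""
    return sum(1 for _ in rows)


def count_column(rows, index):
    """Model of count(col): only rows where that column is not NULL count."""
    return sum(1 for row in rows if row[index] is not None)


if __name__ == "__main__":
    print("count(1)    ->", count_rows(rows))       # all rows
    print("count(coll) ->", count_column(rows, 1))  # rows with non-NULL coll (df1:a)
```

In this model `count(1)` gives 3, whereas counting only the rows with a non-NULL `coll` (mapped to `df1:a`) gives 1, which is the value Hive returned. That matches the reporter's observation, but it does not by itself establish the cause inside Hive or the HBase storage handler.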