Hi all,

I am *really* interested in HIVE-1634 (https://issues.apache.org/jira/browse/HIVE-1634). I have just built Hive from trunk against HBase 0.90.4 (we run cdh3u2).
We have an HBase table populated with Bytes, so I create the Hive table like so:

CREATE EXTERNAL TABLE tim_hbase_occurrence (
  id int,
  scientific_name string,
  data_resource_id int
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,v:scientific_name,v:data_resource_id",
  "hbase.columns.storage.types" = "b,b,b"
)
TBLPROPERTIES(
  "hbase.table.name" = "mini_occurrences",
  "hbase.table.default.storage.type" = "binary"
);

A plain SELECT suggests that it understands the formats:

hive> SELECT * FROM tim_hbase_occurrence LIMIT 3;
OK
1444    Abies alba    1081
1445    Abies alba    1081
1446    Abies alba    1081

But any query with a WHERE clause suggests that it does not:

hive> SELECT * FROM tim_hbase_occurrence WHERE scientific_name='Abies alba' limit 3;
...
NULL    Abies alba    NULL
NULL    Abies alba    NULL
NULL    Abies alba    NULL
Time taken: 9.668 seconds

hive> SELECT * FROM tim_hbase_occurrence WHERE data_resource_id=1081;
...
0 (no records)

Can anyone provide any guidance on this, please?

Thanks!
Tim