Hi all, I am facing an issue when selecting a nullable column twice in a Hive SELECT statement against an HBase table. For rows where that column is null, the two occurrences return different values: the first is NULL (correct), and the second is the last non-null value read from an earlier row. Has anyone seen this issue?
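For anyone who wants to check their own output for the same symptom, here is a minimal sketch in Python. It assumes the Hive CLI output has been saved as tab-separated lines and that the duplicated column sits at positions 1 and 3 (0-based), as in the query in this message. The function name and the sample layout are hypothetical; the sample values are copied from the results in this message.

```python
def find_mismatched_rows(lines, first_idx=1, second_idx=3):
    """Return the parsed rows whose two copies of the same column differ.

    Assumption: each line is one tab-separated result row, with NULL
    printed as the literal string "NULL".
    """
    mismatched = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(first_idx, second_idx):
            continue  # skip short or malformed lines
        if fields[first_idx] != fields[second_idx]:
            mismatched.append(fields)
    return mismatched


# Four rows taken from the results in this message (tab-separated by assumption).
SAMPLE = [
    "00fe4cc7-e7a9-4351-8f9a-8165547dc4f5\t78913\t2013-05-21 15:29:25.765\t78913",
    "01004f8b-8817-4866-84c7-b40e634a41d9\tNULL\t2013-05-21 14:36:29.405\t78913",
    "01d91d8a-bd17-44aa-9604-cf6c8ca3d4f3\t89464\t2013-05-21 14:05:51.820\t89464",
    "0207b0bd-f3ca-4b9a-acf1-afc401146195\tNULL\t2013-05-21 14:16:36.733\t89464",
]

if __name__ == "__main__":
    # Print every row where the second copy of memberId disagrees with the first.
    for row in find_mismatched_rows(SAMPLE):
        print("\t".join(row))
```

On the sample rows this flags the two rows where the first memberId is NULL but the second carries a value from an earlier row.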
The query:

    hive> select visitorId, memberId, dateCreated, memberId from hbase_page_view;

The results:

    003320da-01bf-4ddc-80e9-5d070389a53d  NULL   2013-05-21 15:19:31.781
    007be1d9-a93a-4ca5-a5d3-e6047aa5cfd0  NULL   2013-05-21 14:18:14.623
    00fe4cc7-e7a9-4351-8f9a-8165547dc4f5  NULL   2013-05-21 15:27:03.628
    00fe4cc7-e7a9-4351-8f9a-8165547dc4f5  NULL   2013-05-21 15:27:34.174
    00fe4cc7-e7a9-4351-8f9a-8165547dc4f5  78913  2013-05-21 15:28:58.714  78913
    00fe4cc7-e7a9-4351-8f9a-8165547dc4f5  78913  2013-05-21 15:29:25.765  78913
    01004f8b-8817-4866-84c7-b40e634a41d9  NULL   2013-05-21 14:36:29.405  78913
    01619700-11f2-4b88-a9ef-55dc609379ca  NULL   2013-05-21 15:56:11.157  78913
    01a5036a-7ebc-4f35-9be7-461ef318b4c9  NULL   2013-05-21 14:06:18.014  78913
    01d91d8a-bd17-44aa-9604-cf6c8ca3d4f3  NULL   2013-05-21 14:05:02.095  78913
    01d91d8a-bd17-44aa-9604-cf6c8ca3d4f3  89464  2013-05-21 14:05:51.820  89464
    01d91d8a-bd17-44aa-9604-cf6c8ca3d4f3  89464  2013-05-21 14:06:53.558  89464
    01d91d8a-bd17-44aa-9604-cf6c8ca3d4f3  89464  2013-05-21 14:07:23.479  89464
    01d91d8a-bd17-44aa-9604-cf6c8ca3d4f3  89464  2013-05-21 14:13:59.841  89464
    0207b0bd-f3ca-4b9a-acf1-afc401146195  NULL   2013-05-21 14:16:36.733  89464
    0207b0bd-f3ca-4b9a-acf1-afc401146195  NULL   2013-05-21 14:28:12.305  89464
    0526fae9-310a-4fbc-b2e9-4ecb5a4d21e2  NULL   2013-05-21 14:20:08.159  89464
    0526fae9-310a-4fbc-b2e9-4ecb5a4d21e2  NULL   2013-05-21 14:21:08.006  89464
    0526fae9-310a-4fbc-b2e9-4ecb5a4d21e2  NULL   2013-05-21 14:21:21.178  89464
    0526fae9-310a-4fbc-b2e9-4ecb5a4d21e2  89472  2013-05-21 14:22:05.391  89472
    0526fae9-310a-4fbc-b2e9-4ecb5a4d21e2  89472  2013-05-21 14:24:38.619  89472
    0526fae9-310a-4fbc-b2e9-4ecb5a4d21e2  89472  2013-05-21 14:31:32.279  89472
    0559118e-559a-469e-aeff-90254b128fa6  NULL   2013-05-21 14:02:17.084  89472
    0559118e-559a-469e-aeff-90254b128fa6  NULL   2013-05-21 14:02:31.437  89472
    0559118e-559a-469e-aeff-90254b128fa6  NULL   2013-05-21 14:03:43.456  89472
    05bc7517-ceac-40f7-81c9-6da6d1c9713b  NULL   2013-05-21 15:57:49.375  89472
    0625208e-015a-46be-85b5-ed5102af3d7c  NULL   2013-05-21 15:29:29.004  89472

The HBase table has a single column family. It is mapped to an external Hive table using the standard Hive idiom, like so:

    CREATE EXTERNAL TABLE hbase_page_view(key string, visitorId string, dateCreated string, memberId string, blah blah)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,eve:visid,eve:datec,eve:mid,.....")
    TBLPROPERTIES ("hbase.table.name" = "page_view");

Thanks,
Rupinder