[ 
https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010482#comment-13010482
 ] 

He Yongqiang commented on HIVE-1803:
------------------------------------

Had an offline discussion with Namit on this JIRA. 

The basic question is how to use this bitmap index. Given that there are 
millions of rows in one block, the block will contain all the distinct values 
this column has, so the bitmap index will not be very useful. A possible use 
case is a bitmap AND/OR. E.g., we need to find all records about Male in Japan, 
where Male and Japan are both bitmap indexed. What we can do today is to first 
do a JOIN and BITMAP AND operation on the 2 index tables, and then find all the 
matching blocks. That is OK, but it requires a join operation. If we can 
support a bitmap index with more than one index column, it will help in this 
case. I mean each index column in the index table has its own bitmap. E.g., 
FILE_NAME, BLK_OFFSET, GENDER, bitmapForGENDER, COUNTRY, bitmapForCountry. 
bitmapForGENDER will have two bitmaps internally, one for Male and one for 
Female, and bitmapForCountry will have one bitmap for each country.
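
To make the multi-column idea concrete, here is a rough sketch in Python. It is 
only an illustration, not the patch's code: the in-memory layout, the function 
names (build_bitmaps, build_index, matching_blocks), and the sample file name 
and offsets are all made up, and a plain int stands in for a compressed bitmap. 
Each index entry is keyed by (FILE_NAME, BLK_OFFSET) and holds one bitmap per 
distinct value of each indexed column, so the AND happens inside one entry and 
no join is needed.

```python
# Hypothetical in-memory model of one multi-column bitmap index table.
# Bit i of a bitmap is set when row i of the block has that value.

def build_bitmaps(values):
    """Return {value: int bitmap} for one column of one block."""
    bitmaps = {}
    for row, value in enumerate(values):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps

def build_index(blocks):
    """blocks: {(file_name, blk_offset): [(gender, country), ...]}"""
    index = {}
    for key, rows in blocks.items():
        index[key] = {
            "gender": build_bitmaps([g for g, _ in rows]),
            "country": build_bitmaps([c for _, c in rows]),
        }
    return index

def matching_blocks(index, gender, country):
    """AND the two per-value bitmaps of each index entry; no join needed."""
    result = {}
    for key, entry in index.items():
        hits = entry["gender"].get(gender, 0) & entry["country"].get(country, 0)
        if hits:  # keep only blocks with at least one matching row
            result[key] = hits
    return result

# Made-up sample data: two blocks of one file.
blocks = {
    ("part-00000", 0): [("Male", "Japan"), ("Female", "Japan"), ("Male", "China")],
    ("part-00000", 1024): [("Female", "Japan"), ("Male", "China")],
}
index = build_index(blocks)
matches = matching_blocks(index, "Male", "Japan")
```

Note that the second block contains both a Male row and a Japan row, but never 
in the same row, so its ANDed bitmap is empty and the block is pruned.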

And if Hive can support skipping rows, the bitmap index will be very useful. I 
mean that with bitmap indexing, block pruning alone may not be good enough. For 
example, if in a block we find that only row1, row3, and lastRow satisfy the 
predicate, we can just skip row2 and row4 through lastRow-1.
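
A small sketch of what row skipping could look like, again in Python and again 
only under assumptions: the bitmap is a plain int with one bit per row of the 
block, rows are numbered from 0, and the helper names (rows_to_read, 
skip_ranges) are hypothetical. A reader that can seek within a block would only 
visit the selected rows and jump over the skipped ranges.

```python
def rows_to_read(bitmap, num_rows):
    """Yield the 0-based row numbers whose bit is set in the bitmap."""
    for row in range(num_rows):
        if bitmap & (1 << row):
            yield row

def skip_ranges(bitmap, num_rows):
    """Return inclusive (first, last) runs of rows that can be skipped."""
    ranges, start = [], None
    for row in range(num_rows):
        if bitmap & (1 << row):
            if start is not None:
                # A selected row ends the current run of skippable rows.
                ranges.append((start, row - 1))
                start = None
        elif start is None:
            start = row
    if start is not None:
        ranges.append((start, num_rows - 1))
    return ranges

# A block of 8 rows where only the first, third, and last rows match.
num_rows = 8
bitmap = (1 << 0) | (1 << 2) | (1 << 7)
selected = list(rows_to_read(bitmap, num_rows))
skipped = skip_ranges(bitmap, num_rows)
```

Here selected holds rows 0, 2, and 7, and skipped holds the runs (1, 1) and 
(3, 6), which correspond to row2 and row4 through lastRow-1 in the 1-based 
wording above.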


What do you think?

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, 
> HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, 
> JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, 
> javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
