[
https://issues.apache.org/jira/browse/KUDU-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052063#comment-16052063
]
Andrew Wong commented on KUDU-2038:
-----------------------------------
I think the above placement (storing the index as a part of the base data)
makes sense.
Applying mutations after reading might be tricky, though. Currently, if there
are mutations, we 1) read a block of rows from disk, 2) apply deltas to
that block, and 3) evaluate the predicate against each row. The benefit
of a bitmap index is that we can avoid 1) and do 3) without reading the row
data into memory, but at the moment, 2) _must_ come before 3). Invalidating the
index in the presence of mutations, as Todd mentioned, is probably the simplest
solution.
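To make the ordering problem concrete, here is a rough Python sketch of the "invalidate on mutation" approach. It is not Kudu code: the RowSet class, its fields, and scan_equals() are all hypothetical names made up for illustration, and a dict of row-index sets stands in for real bitmaps. The index is built from base data only, so it is dropped as soon as a delta exists and the scan falls back to the read / apply deltas / evaluate path.

```python
from dataclasses import dataclass, field


@dataclass
class RowSet:
    """Hypothetical rowset: one column of base data plus unflushed deltas."""
    base: list                                   # base values, by row index
    deltas: dict = field(default_factory=dict)   # row index -> updated value
    bitmap_index: dict = None                    # value -> set of row indexes (assumption)

    def build_index(self):
        # The index only reflects base data, never deltas.
        index = {}
        for row_idx, value in enumerate(self.base):
            index.setdefault(value, set()).add(row_idx)
        self.bitmap_index = index

    def update(self, row_idx, value):
        self.deltas[row_idx] = value
        # Simplest policy: any mutation invalidates the index.
        self.bitmap_index = None

    def scan_equals(self, wanted):
        if self.bitmap_index is not None:
            # No deltas exist, so the predicate can be answered from the
            # index without reading the row data.
            return sorted(self.bitmap_index.get(wanted, set())), "index"
        # Fallback: 1) read the block, 2) apply deltas, 3) evaluate.
        block = list(self.base)
        for row_idx, value in self.deltas.items():
            block[row_idx] = value
        return [i for i, v in enumerate(block) if v == wanted], "scan"
```

With this policy the index is only ever consulted when it is known to match the current row values, at the cost of losing it entirely until the deltas are flushed and the index is rebuilt.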
Another thing to keep in mind is that building the bitmap index isn't trivial
since it essentially requires reading the entire table. If starting from a
brand new cluster, this wouldn't be so bad. If data already exists in the table
and an index is created, there will likely be some time between the initial
call to CreateBitmapIndex() and the bitmap actually being ready. If the
bitmap index can be specified at tablet creation, this probably isn't an issue.
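As a rough illustration of the gap between requesting an index and being able to use it, the Python sketch below builds an index over existing data in batches. BitmapIndexBuilder, step(), and lookup() are hypothetical names, not a proposed API, and the batch size is arbitrary. Until every existing row has been read, lookups fall back to a full scan.

```python
class BitmapIndexBuilder:
    """Hypothetical incremental index build over an existing column."""

    def __init__(self, column, batch_size=2):
        self.column = column
        self.batch_size = batch_size
        self.next_row = 0      # first row not yet indexed
        self.index = {}        # value -> set of row indexes

    @property
    def ready(self):
        # The index is usable only once every existing row has been read.
        return self.next_row >= len(self.column)

    def step(self):
        # Index one more batch of rows; returns True when the build is done.
        end = min(self.next_row + self.batch_size, len(self.column))
        for row_idx in range(self.next_row, end):
            self.index.setdefault(self.column[row_idx], set()).add(row_idx)
        self.next_row = end
        return self.ready


def lookup(builder, wanted):
    # Use the index only when it is complete; otherwise scan every row.
    if builder.ready:
        return sorted(builder.index.get(wanted, set()))
    return [i for i, v in enumerate(builder.column) if v == wanted]
```

If the index is declared when the table is created, the column starts out empty, ready is true immediately, and there is no such window.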
> Add b-tree or inverted index on value field
> -------------------------------------------
>
> Key: KUDU-2038
> URL: https://issues.apache.org/jira/browse/KUDU-2038
> Project: Kudu
> Issue Type: Wish
> Reporter: Yi Guolei
>
> Do we have a plan to add an index on any column (not a primary key column)? Currently
> Kudu does not have a b-tree or inverted index on columns. In this case, if a
> query wants to filter on a column, then Kudu has to scan all data in all
> rowsets.
> For example, for select * from table where salary > 10000 and age < 40, the bloom
> filter or min/max index will have no effect, and Kudu has to scan all data in
> all rowsets. But if Kudu had an inverted index, it would be much faster.