Zhenya,

there is nothing in common in implementation of Ignite indexes and
Phoenix indexes. I just borrowed the idea how Phoenix supplies the index
metadata (index name, columns, sorting, etc.) to Calcite optimizer. It's
not about index implementation, it's about metadata handling.


-- 
Kind Regards
Roman Kondakov


On 10.12.2019 16:40, Zhenya Stanilovsky wrote:
> 
> Roman just as fast remark, Phoenix builds their approach on already existing 
> monolith HBase architecture, most cases it`s just a stub for someone who 
> wants use secondary indexes with a base with no native support of it. Don`t 
> think it`s good idea here.
>    
>>
>>
>> ------- Forwarded message -------
>> From: "Roman Kondakov" < kondako...@mail.ru.invalid >
>> To:  dev@ignite.apache.org
>> Cc:
>> Subject: Adding support for Ignite secondary indexes to Apache Calcite
>> planner
>> Date: Tue, 10 Dec 2019 15:55:52 +0300
>>
>> Hi all!
>>
>> As you may know there is an activity on integration of Apache Calcite
>> query optimizer into Ignite codebase is being carried out [1],[2].
>>
>> One of a bunch of problems in this integration is the absence of
>> out-of-the-box support for secondary indexes in Apache Calcite. After
>> some research I came to conclusion that this problem has a couple of
>> workarounds. Let's name them
>> 1. Phoenix-style approach - representing secondary indexes as
>> materialized views which are natively supported by Calcite engine [3]
>> 2. Drill-style approach - pushing filters into the table scans and
>> choose appropriate index for lookups when possible [4]
>>
>> Both these approaches have advantages and disadvantages:
>>
>> Phoenix style pros:
>> - natural way of adding indexes as an alternative source of rows: index
>> can be considered as a kind of sorted materialized view.
>> - possibility of using index sortedness for stream aggregates,
>> deduplication (DISTINCT operator), merge joins, etc.
>> - ability to support other types of indexes (i.e. functional indexes).
>>
>> Phoenix style cons:
>> - polluting optimizer's search space extra table scans hence increasing
>> the planning time.
>>
>> Drill style pros:
>> - easier to implement (although it's questionable).
>> - search space is not inflated.
>>
>> Drill style cons:
>> - missed opportunity to exploit sortedness.
>>
>> There is a good discussion about using both approaches can be found in [5].
>>
>> I made a small sketch [6] in order to demonstrate the applicability of
>> the Phoenix approach to Ignite. Key design concepts are:
>> 1. On creating indexes are registered as tables in Calcite schema. This
>> step is needed for internal Calcite's routines.
>> 2. On planner initialization we register these indexes as materialized
>> views in Calcite's optimizer using VolcanoPlanner#addMaterialization
>> method.
>> 3. Right before the query execution Calcite selects all materialized
>> views (indexes) which can be potentially used in query.
>> 4. During the query optimization indexes are registered by planner as
>> usual TableScans and hence can be chosen by optimizer if they have lower
>> cost.
>>
>> This sketch shows the ability to exploit index sortedness only. So the
>> future work in this direction should be focused on using indexes for
>> fast index lookups. At first glance FilterableTable and
>> FilterTableScanRule are good points to start. We can push Filter into
>> the TableScan and then use FilterableTable for fast index lookups
>> avoiding reading the whole index on TableScan step and then filtering
>> its output on the Filter step.
>>
>> What do you think?
>>
>>
>>
>> [1]
>> http://apache-ignite-developers.2346864.n4.nabble.com/New-SQL-execution-engine-tt43724.html#none
>> [2]
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-37%3A+New+query+execution+engine
>> [3]  https://issues.apache.org/jira/browse/PHOENIX-2047
>> [4]  https://issues.apache.org/jira/browse/DRILL-6381
>> [5]  https://issues.apache.org/jira/browse/DRILL-3929
>> [6]  https://github.com/apache/ignite/pull/7115 
>  
>  
>  
>  
> 

Reply via email to