[jira] [Commented] (IGNITE-4035) SQL: Avoid excessive calls of deterministic functions on same arguments

Sergey Kalashnikov (JIRA) Tue, 07 Feb 2017 09:18:12 -0800

    [ https://issues.apache.org/jira/browse/IGNITE-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15856353#comment-15856353 ]


Sergey Kalashnikov commented on IGNITE-4035:
--------------------------------------------

To my mind, the best way of addressing this is to introduce the optimization 
into H2 itself.

As for Ignite:

1) We could replace specific nodes (subtrees, actually) in the expression tree
of the H2 prepared SQL statement with references to equivalent nodes that we
have already visited and cached.
However, the prepared statement tree is currently traversed only during query
splitting, when the map/reduce queries are formed.
For local queries, adding an expression tree traversal seems like too much
overhead for the sake of a single optimization.
For a two-step query, the optimized expression tree is converted back to SQL
text so that it can be sent to the server node for execution, which effectively
erases the optimization.
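
To make the idea in 1) concrete, here is a minimal plain-Java sketch of sharing
equal subtrees. It is an illustration only: Node, dedup and ExprDedupSketch are
hypothetical names, not H2 or Ignite classes, and the sketch assumes that every
function in the tree is deterministic (a real implementation would have to
check that before sharing a node).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an expression tree; this is NOT H2's Expression API.
final class ExprDedupSketch {
    /** A node is an operator, function name, or leaf text plus its child nodes. */
    record Node(String op, List<Node> args) {
        static Node leaf(String name) { return new Node(name, List.of()); }
        static Node call(String op, Node... args) { return new Node(op, List.of(args)); }
    }

    /**
     * Rebuilds the tree bottom-up and returns, for every structurally equal
     * subtree, the single instance that was seen first. Record equality is
     * structural, so the map key compares whole subtrees.
     * Assumption: all functions are deterministic, so sharing is safe.
     */
    static Node dedup(Node n, Map<Node, Node> seen) {
        List<Node> args = new ArrayList<>();
        for (Node a : n.args()) {
            args.add(dedup(a, seen));
        }
        Node rebuilt = new Node(n.op(), List.copyOf(args));
        Node prev = seen.putIfAbsent(rebuilt, rebuilt);
        return prev != null ? prev : rebuilt;
    }

    /** Builds a fresh DATEDIFF('s', ts1, ts2) subtree on every call. */
    static Node datediff() {
        return Node.call("DATEDIFF", Node.leaf("'s'"), Node.leaf("ts1"), Node.leaf("ts2"));
    }

    public static void main(String[] args) {
        Node root = Node.call("SELECT",
            Node.call("AVG", datediff()),
            Node.call("MIN", datediff()),
            Node.call("MAX", datediff()));

        Node shared = dedup(root, new HashMap<>());
        Node underAvg = shared.args().get(0).args().get(0);
        Node underMin = shared.args().get(1).args().get(0);

        // After dedup, both aggregates point at the same DATEDIFF instance,
        // so an evaluator could cache its per-row result on that node.
        System.out.println("same instance: " + (underAvg == underMin));
    }
}
```

Sharing the node is only half of the optimization: the evaluator would still
need to cache the per-row value on the shared node. And, as noted above, the
sharing is lost as soon as the tree is serialized back to SQL text.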

2) Alternatively, we could transform the initial query so that the function
call is moved into a subquery, as follows:
From:
SELECT AVG(datediff('s',ts1,ts2)), MIN(datediff('s',ts1,ts2)), MAX(datediff('s',ts1,ts2)) FROM TABLE
To:
SELECT AVG(Y), MIN(Y), MAX(Y) FROM (SELECT DATEDIFF('s', ts1, ts2) AS Y FROM TABLE)

This seems feasible for the TwoStepQuery case, since the tree traversal is
already there.
Still, such logic seems heavier than a theoretical fix in H2.
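
The intended effect of the rewrite in 2) can be simulated in plain Java. This is
only a sketch of the evaluation pattern, not Ignite or H2 code: Row,
diffSeconds and the call counter are hypothetical, and diffSeconds merely
stands in for a costly deterministic function such as DATEDIFF.

```java
import java.util.List;
import java.util.LongSummaryStatistics;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.LongStream;

// Hypothetical simulation of the two query shapes; no SQL engine is involved.
final class SubqueryRewriteSketch {
    record Row(long ts1, long ts2) {}

    /** Counts how often the "expensive" function is invoked. */
    static final AtomicInteger calls = new AtomicInteger();

    /** Stand-in for a costly deterministic function such as DATEDIFF('s', ts1, ts2). */
    static long diffSeconds(Row r) {
        calls.incrementAndGet();
        return r.ts2() - r.ts1();
    }

    /** Original shape: every aggregate evaluates the function on its own. Returns {avg, min, max}. */
    static double[] direct(List<Row> rows) {
        double avg = rows.stream().mapToLong(SubqueryRewriteSketch::diffSeconds).average().orElse(Double.NaN);
        long min = rows.stream().mapToLong(SubqueryRewriteSketch::diffSeconds).min().orElse(0L);
        long max = rows.stream().mapToLong(SubqueryRewriteSketch::diffSeconds).max().orElse(0L);
        return new double[] {avg, min, max};
    }

    /** Rewritten shape: the inner step computes Y once per row, the aggregates only read Y. */
    static double[] viaSubquery(List<Row> rows) {
        long[] y = rows.stream().mapToLong(SubqueryRewriteSketch::diffSeconds).toArray();
        LongSummaryStatistics s = LongStream.of(y).summaryStatistics();
        return new double[] {s.getAverage(), s.getMin(), s.getMax()};
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(new Row(0, 10), new Row(5, 8), new Row(100, 160));

        calls.set(0);
        direct(rows);
        System.out.println("direct calls per row: " + calls.get() / rows.size());

        calls.set(0);
        viaSubquery(rows);
        System.out.println("subquery calls per row: " + calls.get() / rows.size());
    }
}
```

Whether a real engine evaluates the subquery column exactly once per row
depends on its planner, so the simulation shows the goal of the rewrite rather
than a guarantee.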

> SQL: Avoid excessive calls of deterministic functions on same arguments
> -----------------------------------------------------------------------
>
>                 Key: IGNITE-4035
>                 URL: https://issues.apache.org/jira/browse/IGNITE-4035
>             Project: Ignite
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 1.6, 1.7
>            Reporter: Andrew Mashenkov
>            Assignee: Sergey Kalashnikov
>              Labels: performance
>             Fix For: 2.0
>
>
> In the SQL query example below, the heavy deterministic "datediff" function
> will be called 4 times per row. I'd expect the function to be called once
> per row.
> Example:
> {noformat}
> Select
>   avg(datediff('s',ts1,ts2)) as avg_diff,
>   min(datediff('s',ts1,ts2)) as min_diff,
>   max(datediff('s',ts1,ts2)) as max_diff
> From table
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
