[ 
https://issues.apache.org/jira/browse/CALCITE-3531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980648#comment-16980648
 ] 

Danny Chen commented on CALCITE-3531:
-------------------------------------

I saw that Julian always thought that these functions are deterministic, but 
for the constant reduction, there is a fix to treat them as non deterministic 
in 
https://github.com/apache/calcite/commit/39595df932a66a47fcc21558d1ffe5139297b3ea,
and a fix from me CALCITE-2638.

So, what is the meanings to mark these functions deterministic ? Just as 
[~amaliujia] says, Calcite's QE implements it as non-deterministic, we can not 
do constant reduction for them, we can not catch plans for them, we can not 
remove constant group key from them.

I'm inclined to mark them non-deterministic, if any engines want to implement 
them as deterministic, just implements them in the QE is okey(reuse the context 
time variable), then override the isDeterministic flag for them.

> AggregateProjectPullUpConstantsRule should not remove deterministic function 
> group key if the function is dynamic
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-3531
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3531
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.21.0
>            Reporter: Danny Chen
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.22.0
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Now AggregateProjectPullUpConstantsRule simplify the query:
> {code:sql}
> select hiredate
> from sales.emp
> where sal is null and hiredate = current_timestamp
> group by sal, hiredate
> having count(*) > 3
> {code}
> from plan:
> {code:xml}
> LogicalProject(HIREDATE=[$1])
>   LogicalFilter(condition=[>($2, 3)])
>     LogicalAggregate(group=[{0, 1}], agg#0=[COUNT()])
>       LogicalProject(SAL=[$5], HIREDATE=[$4])
>         LogicalFilter(condition=[AND(IS NULL($5), =($4, CURRENT_TIMESTAMP))])
>           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> to plan:
> {code:xml}
> LogicalProject(HIREDATE=[$1])
>   LogicalFilter(condition=[>($2, 3)])
>     LogicalProject(SAL=[$0], HIREDATE=[CURRENT_TIMESTAMP], $f2=[$1])
>       LogicalAggregate(group=[{0}], agg#0=[COUNT()])
>         LogicalProject(SAL=[$5], HIREDATE=[$4])
>           LogicalFilter(condition=[AND(IS NULL($5), =($4, 
> CURRENT_TIMESTAMP))])
>             LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> which is unsafe, because for stream sql, we need to group data by dateTime, 
> also the result is wrong if a batch job runs across days.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to