Hi Julian,

Thanks again for your feedback.

Since ANY_VALUE and SINGLE_VALUE are duplicate-insensitive, they should also
be splittable (SqlSplittableAggFunction), just like MIN and MAX.
What do you think?

I would like to file a JIRA issue for this, so that more optimizations can
be applied.
Any feedback is appreciated.
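
To make the two properties concrete, here is a minimal sketch in plain Java.
It does not use any Calcite classes; the class and method names are
hypothetical and only illustrate the idea on the sample data from my first
mail. It checks that MIN gives the same result with or without
deduplication, that SUM does not, and that MIN can be computed from partial
MINs over a split of the input.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; not part of Calcite.
public class DuplicateInsensitiveDemo {

    // MIN over a non-empty list of integers.
    public static int min(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).min().getAsInt();
    }

    // SUM over a list of integers.
    public static int sum(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).sum();
    }

    // Removes duplicates while keeping the first occurrence of each value.
    public static List<Integer> dedup(List<Integer> values) {
        return values.stream().distinct().collect(Collectors.toList());
    }

    // "Split" evaluation: take MIN of each partition, then MIN of those results.
    public static int minOfPartialMins(List<List<Integer>> partitions) {
        List<Integer> partials = partitions.stream()
            .map(DuplicateInsensitiveDemo::min)
            .collect(Collectors.toList());
        return min(partials);
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 1, 2, 2, 3, 5, 5);

        // Duplicate-insensitive: the result is unchanged by deduplication.
        System.out.println("MIN same after dedup: "
            + (min(data) == min(dedup(data))));

        // Not duplicate-insensitive: the result changes after deduplication.
        System.out.println("SUM same after dedup: "
            + (sum(data) == sum(dedup(data))));

        // Splittable: MIN over the whole input equals MIN of the partial MINs.
        List<List<Integer>> parts = Arrays.asList(
            Arrays.asList(1, 1, 2),
            Arrays.asList(2, 3, 5, 5));
        System.out.println("MIN same after split: "
            + (min(data) == minOfPartialMins(parts)));
    }
}
```

This is only an illustration of the semantics; how ANY_VALUE and
SINGLE_VALUE would actually implement SqlSplittableAggFunction is something
to work out in the JIRA.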

Best,
Liya Fan



On Wed, Oct 14, 2020 at 2:59 AM Julian Hyde <[email protected]> wrote:

> I agree. ANY_VALUE and SINGLE_VALUE are duplicate-insensitive.
>
> > On Oct 13, 2020, at 2:17 AM, Fan Liya <[email protected]> wrote:
> >
> > Hi Julian,
> >
> > Thanks a lot for your feedback.
> > I think SqlAggFunction.getDistinctOptionality() is exactly what I
> > am looking for.
> >
> > BTW, I think ANY_VALUE and SINGLE_VALUE also belong to the category of
> > duplicate insensitive functions.
> > What do you think?
> >
> > Best,
> > Liya Fan
> >
> >
> >
> > On Tue, Oct 13, 2020 at 4:55 PM Julian Hyde <[email protected]> wrote:
> >
> >> We already have this concept. See SqlAggFunction.getDistinctOptionality(),
> >> added in https://issues.apache.org/jira/browse/CALCITE-3159.
> >>
> >> Julian
> >>
> >>
> >>> On Oct 13, 2020, at 12:54 AM, Fan Liya <[email protected]> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> I would like to introduce the idea of duplicate insensitive aggregate
> >>> functions.
> >>>
> >>> For such functions, the aggregation results remain the same even after
> >>> deduplication.
> >>>
> >>> For example, given a sequence of data {1, 1, 2, 2, 3, 5, 5}, the
> >>> aggregation result of MIN is the same regardless of whether we perform
> >>> data deduplication first. That is,
> >>>
> >>> MIN({1, 1, 2, 2, 3, 5, 5}) = MIN({1, 2, 3, 5})
> >>>
> >>> So MIN is a *duplicate-insensitive function*.
> >>>
> >>> On the other hand, function SUM is not duplicate insensitive, because
> >>>
> >>> SUM({1, 1, 2, 2, 3, 5, 5}) != SUM({1, 2, 3, 5})
> >>>
> >>> The concept of duplicate insensitivity can help us in many optimization
> >>> scenarios.
> >>>
> >>> For example, the current implementation of AggregateMergeRule rules out
> >>> any aggregate calls for which the isDistinct() method returns true.
> >>> However, for duplicate-insensitive functions, the rule should be
> >>> applicable.
> >>>
> >>> Could you please give your valuable feedback?
> >>>
> >>> Best,
> >>> Liya Fan
> >>
> >>
>
>
