I have no strong opinion since I am not using the adapter.

However, I have a small question regarding option 2.

Is the query still going to be inefficient if the user adds the IS NOT NULL
predicate?
If yes, then I probably option 1 would be the best option for the moment.
Otherwise, I don't find it
very problematic to expect the users to right their queries correclty. The
same happens with other database
(e.g., Oracle) where sometimes you need to introduce NULL checks in the
queries in order to use the
existing indexes and have efficient queries.

Best,
Stamatis


Στις Πέμ, 13 Δεκ 2018 στις 4:40 μ.μ., ο/η Andrei Sereda <[email protected]>
έγραψε:

> Thanks, Stamatis.
>
> I agree that sentinels are not the best approach but without pushing
> aggregations to ES they're not very usable to elastic users.
>
> So there are two bad choices:
> 1) (Low?) Probability of collisions triggering invalid results (surprises)
> but efficient query.
> 2)  Inefficient query (fetching whole index in memory)
>
> I hope we can drop support for ES 2.x soon and enforce ES >= 6.1. If so,
> one can use  Composite aggregations
> <
> https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-composite-aggregation.html
> >
> which don't have these shortcomings.
> What we can do now is have separate aggregation query for ES2 and ES6.
>
> What do you think?
>
> Regards,
> Andrei.
>
>
>
> On Thu, Dec 13, 2018 at 6:16 AM Stamatis Zampetakis <[email protected]>
> wrote:
>
> > Hi Andrei,
> >
> > I haven't gone entirely over the new PR, but I think there are cases
> where
> > the result of the queries are going to be wrong (when values collide with
> > sentinels).
> > Another approach would be to introduce a rule in order to push the
> > aggregation in ElasticSearch *only* if the field in questions is *not*
> > nullable. This can make
> > queries a bit more verbose but it guarantees that there will be no
> suprises
> > to the end-user.
> >
> > select max(amount), date from orders group by amount, date -> Aggregation
> > not pushed on ES
> > select max(amount), date from orders where amount is not null and date is
> > not null group by amount, date -> Aggregation pushed on ES
> >
> > Sorry for the late reply!
> >
> > Best,
> > Stamatis
> >
> >
> > Στις Κυρ, 2 Δεκ 2018 στις 2:01 π.μ., ο/η Julian Hyde <[email protected]>
> > έγραψε:
> >
> > > Interesting.
> > >
> > > One of the benefits that a SQL layer such as Calcite can bring is that
> it
> > > hides the details necessary to make operations like this work.
> > >
> > > Julian
> > >
> > >
> > > > On Dec 1, 2018, at 5:22 AM, Kevin Risden <[email protected]> wrote:
> > > >
> > > > I haven't had a chance to review but saw that Elastic has the same
> > issue
> > > > with aggregations.
> > > >
> > > > https://github.com/elastic/elasticsearch/issues/35745
> > > >
> > > > Kevin Risden
> > > >
> > > > On Wed, Nov 28, 2018, 20:46 Andrei Sereda <[email protected] wrote:
> > > >
> > > >> Greetings ,
> > > >> We have discovered an issue with ES aggregations when grouping on
> > > >> non-textual fields (date, long). Currently the following sql fails
> > > because
> > > >> for missing value
> > > >> <
> > > >>
> > >
> >
> https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_missing_value_13
> > > >>>
> > > >> we inject __MISSING__ sentinel which is not date / number parseable
> > (it
> > > >> can’t be null either) :
> > > >>
> > > >> select max(amount), date from orders group by date -- special ES
> type
> > > >>
> > > >> The solution is to make sentinel type specific :
> > > >>
> > > >>   1. Integer.MIN_VALUE for integers
> > > >>   2. 9999-12-31 for dates etc.
> > > >>
> > > >> For low cardinality types like boolean, byte, short I’m not sure
> what
> > > to do
> > > >> since there is high probability of collision between missing field
> and
> > > >> actual value (eg. what value to choose for missing boolean?) :
> > > >>
> > > >> select max(amount), isActive from orders group by isActive --
> boolean
> > > type
> > > >>
> > > >> Let me know if you solved this problem differently before. Composite
> > > >> aggregations
> > > >> <
> > > >>
> > >
> >
> https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-composite-aggregation.html
> > > >>>
> > > >> (available since 6.1) should help in future.
> > > >>
> > > >> PR: https://github.com/apache/calcite/pull/946
> > > >> JIRA: https://issues.apache.org/jira/browse/CALCITE-2689
> > > >>
> > > >> Many Thanks in Advance,
> > > >> Andrei.
> > > >>
> > >
> > >
> >
>

Reply via email to