Re: New SQL execution engine

Andrey Mashenkov Fri, 27 Sep 2019 08:08:49 -0700

Nikolay, Igor.

Implementing from scratch is an option, of course.
If we decide to go this way, then we definitely won't want to spend long
nights inventing "yet another SQL parser" with all the stuff related to query
rewrite rules (e.g. IN -> JOIN) or type casting / validation / conversion.


We thought about replacing H2 step by step.
1. We tried to make a POC that replaced the parser with one generated from
an SQL grammar with ASM,
but this approach looked slow, AFAIR. Gridgainers, does anybody have
something on this?

2. Then we need a planner with all the rules.
Of course, we will need to write rules optimized for "distributed" execution
anyway, but I doubt anybody wants to write the common rules that Calcite
already has.
We can copy-paste, but what for?

3. Then we have to implement an execution pipeline.
Possibly we can adapt new query plans for H2 execution, but then we will
still have the same pain resolving H2 internal issues (e.g. OOM).
The H2 approach is outdated; it doesn't fit Ignite's needs as a distributed
system.

With Calcite we can concentrate on points 2 and (mostly) 3 and reuse
its architectural abstractions; otherwise we would have to reinvent those
abstractions through long discussions on the dev list.

I agree, we should first make the IEP clear to everyone in the community who
wants to be involved in its implementation.
Both approaches ("from scratch" and "with Calcite") are risky, so can we try
to make the new engine an additional "beta" implementation and allow users to
fall back to the old engine until the new one is judged mature enough?
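
To illustrate the fallback idea, here is a minimal sketch. All names in it
(QueryEngine, FallbackEngine, betaEnabled) are hypothetical and are not Ignite
API. It also assumes the beta engine reports a query it cannot handle by
throwing UnsupportedOperationException.

```java
import java.util.List;

/** Hypothetical sketch only: none of these types exist in Ignite. */
public class EngineFallbackSketch {
    /** Minimal stand-in for a SQL engine: takes a query, returns rows. */
    interface QueryEngine {
        List<String> execute(String sql);
    }

    /** Tries the beta engine when enabled; otherwise, or on failure, uses the legacy one. */
    static final class FallbackEngine implements QueryEngine {
        private final QueryEngine legacy;
        private final QueryEngine beta;
        private final boolean betaEnabled;

        FallbackEngine(QueryEngine legacy, QueryEngine beta, boolean betaEnabled) {
            this.legacy = legacy;
            this.beta = beta;
            this.betaEnabled = betaEnabled;
        }

        @Override
        public List<String> execute(String sql) {
            if (betaEnabled) {
                try {
                    return beta.execute(sql);
                } catch (UnsupportedOperationException e) {
                    // Assumption: the beta engine signals "not supported yet" this way.
                    // Fall through to the legacy engine below.
                }
            }
            return legacy.execute(sql);
        }
    }

    public static void main(String[] args) {
        QueryEngine legacy = sql -> List.of("legacy:" + sql);
        QueryEngine beta = sql -> {
            // Pretend the beta engine does not support JOIN yet.
            if (sql.contains("JOIN"))
                throw new UnsupportedOperationException("JOIN is not supported yet");
            return List.of("beta:" + sql);
        };

        QueryEngine engine = new FallbackEngine(legacy, beta, true);
        System.out.println(engine.execute("SELECT 1"));
        System.out.println(engine.execute("SELECT * FROM a JOIN b ON a.id = b.id"));
    }
}
```

With a switch like this the old engine stays the safety net: users opt in to
the beta engine, and any query it rejects is still answered by the old one.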




On Fri, Sep 27, 2019 at 5:08 PM Seliverstov Igor <gvvinbl...@gmail.com>
wrote:

> Nikolay,
>
> The main issue - there is no *selection*.
>
> There is a field of knowledge - relational algebra, which describes how to
> transform relational expressions while preserving their semantics, and a
> couple of implementations (Calcite is the only one written in Java).
>
> There are only two alternatives:
>
> 1) Implementing white papers from scratch
> 2) Adapting Calcite to our needs.
>
> The second way was chosen by several other projects. There is experience,
> there is a list of known issues (like using indexes), so almost everything
> is already done for us.
>
> Implementing a planner is a big deal; I think everybody here understands
> that. That's why our proposal to reuse others' experience is the obvious one.
>
> If you have an alternative - you're welcome, I'll gratefully listen to you.
>
> The main question isn't "WHAT" but "HOW" - that's the discussion topic
> from my point of view.
>
> Regards,
> Igor
>
> > On 27 Sep 2019, at 16:37, Nikolay Izhikov <nizhi...@apache.org> wrote:
> >
> > Roman.
> >
> >> Nikolay, Maxim, I understand that our arguments may not be as obvious
> >> to you as they are to the SQL team. So, please arrange your questions in
> >> a more constructive way.
> >
> > What is the SQL team?
> > I only know the Ignite community :)
> >
> > Please share your knowledge in the IEP.
> > I want to join the process of engine *selection*.
> > It should start with the requirements for such an engine.
> > Can you write them in the IEP, please?
> >
> > My point is very simple:
> >
> > 1. We made the wrong decision with H2
> > 2. We should make a well-thought decision about the new engine.
> >
> >> How many tickets would satisfy you?
> >
> > You write about "issueS" with H2.
> > All I see is one open ticket.
> > The IEP doesn't provide enough information.
> > So it's not about the number of tickets, it's about
> >
> >> These two points (single map-reduce execution and inflexible optimizer)
> >> are the main problems with the current engine.
> >
> > We may come to the point where Calcite (or any other engine) brings us
> > a third and further "main problems".
> > This is what happened with H2.
> >
> > Let's start from what we want to get with the engine and move forward
> > from this base.
> > What do you think?
> >
> >
> >
> > On Fri, 27/09/2019 at 16:15 +0300, Roman Kondakov wrote:
> >> Maxim, Nikolay,
> >>
> >> I've listed two issues which show the ideological flaws of the current
> >> engine.
> >>
> >> 1. IGNITE-11448 - Open. This ticket describes the impossibility of
> >> executing queries that cannot fit into the hardcoded one-pass
> >> map-reduce paradigm.
> >>
> >> 2. IGNITE-6085 - Closed (won't fix). This ticket describes the second
> >> major problem with the current engine: the H2 query optimizer is very
> >> primitive and cannot perform many useful optimizations.
> >>
> >> These two points (single map-reduce execution and inflexible optimizer)
> >> are the main problems with the current engine. It means that our engine
> >> is currently suitable for executing only a very limited subset of
> >> typical SQL queries. For example, it cannot even run most of the TPC-H
> >> benchmark queries because they don't fit the simple map-reduce paradigm.
> >>
> >>> All I see is links to two tickets:
> >>
> >> How many tickets would satisfy you? I named two. And it looks like that
> >> is not enough from your point of view. OK, so how many is enough? The set
> >> of problems caused by the tickets listed above is infinite, so I cannot
> >> create a ticket for each of them.
> >>
> >>> Tech details also should be added.
> >>
> >> Tech details are in the tickets.
> >>
> >>> We can't discuss such a huge change as an execution engine replacement
> >>> with a description like:
> >>> "No data co-location control, i.e. arbitrary data can be returned
> >>> silently" or
> >>> "Low control on how query executes internally, as a result we have
> >>> limited possibility to implement improvements/fixes."
> >>
> >> Why not? Don't you understand these problems? Or don't you think this
> >> is a problem?
> >>
> >>> Let's make these descriptions more specific.
> >>
> >> What do you mean by "more specific"? What are the criteria for a
> >> specific description?
> >>
> >>
> >>
> >> Nikolay, Maxim, I understand that our arguments may not be as obvious
> >> to you as they are to the SQL team. So, please arrange your questions in
> >> a more constructive way.
> >>
> >> Thank you!
>
>

-- 
Best regards,
Andrey V. Mashenkov
