For me that would be great. I'm going to start reading the code and see
what I can write to the arrow guide I'm working on.

Thanks

On Thu, 4 Feb 2021, 11:28 Andrew Lamb, <al...@influxdata.com> wrote:

> Hi Fernando, yes I would be delighted.
>
> I am planning on creating a high level overview w/ slides as a Tech Talk
> (for work, but will be open to the public) sometime in March. How about I
> pull together some initial material, and then I can share that / go over it
> with anyone who is interested?
>
> What do you think?
>
> Andrew
>
>
> On Thu, Feb 4, 2021 at 3:51 AM Fernando Herrera <
> fernando.j.herr...@gmail.com> wrote:
>
> > Hi Andrew,
> > I would like to work a little bit more on DataFusion, so I was wondering if
> > you could give a small walkthrough of the code and how the queries are
> > constructed. Do you think that could be possible?
> >
> > Fernando
> >
> > On Wed, Feb 3, 2021 at 11:13 PM Andrew Lamb <al...@influxdata.com>
> wrote:
> >
> > > This is awesome, thank you Daniel. I agree that focusing on enough SQL
> > for
> > > TPCH queries would be a great idea and way to focus our efforts.
> > >
> > > Subqueries may be the largest remaining outstanding item that I see --
> I
> > > have some ideas of how to implement them on the planner side if others
> > are
> > > interested in collaborating.
> > >
> > > Andrew
> > >
> > > On Wed, Feb 3, 2021 at 4:02 PM Andy Grove <andygrov...@gmail.com>
> wrote:
> > >
> > > > Thanks for the update on this, Daniël. It is great to see the
> progress
> > > with
> > > > this!
> > > >
> > > > Perhaps it is worth creating one JIRA issue per failing query
> detailing
> > > the
> > > > errors and we can link these to the issues that are causing the
> > failures?
> > > >
> > > > On Wed, Feb 3, 2021 at 1:57 PM Mike Seddon <seddo...@gmail.com>
> wrote:
> > > >
> > > > > Hi Daniël,
> > > > >
> > > > > I am working on 22 as part of
> > > https://github.com/apache/arrow/pull/9243
> > > > >
> > > > > We also need to convert all the Float64 schema types to Decimal(n).
> > > > >
> > > > > Cheers,
> > > > > Mike
> > > > >
> > > > > On Thu, Feb 4, 2021 at 5:44 AM Daniël Heres <danielhe...@gmail.com
> >
> > > > wrote:
> > > > >
> > > > > > Hey all,
> > > > > >
> > > > > > Quite some features have been added to DataFusion in the last
> > couple
> > > of
> > > > > > months.
> > > > > >
> > > > > > > One test of the functionality we support is the TPC-H benchmark. We
> > > > > > > now can run 7 out of 22 queries without errors.
> > > > > > > I think a nice goal would be having complete support for the full suite, as
> > > > > > > it means a lot of functionality is included, helps optimization and helps
> > > > > > > us to test against other engines.
> > > > > >
> > > > > > These queries fail currently because of missing features or bugs:
> > > > > >
> > > > > > > * 2 IN (Subquery) in (WHERE) expression
> > > > > > > * 4 Intervals https://github.com/apache/arrow/pull/9373
> > > > > > > * 7 Fails with error "Schema contains duplicate unqualified field name \'n_nationkey\'" https://issues.apache.org/jira/browse/ARROW-11432
> > > > > > > * 8 Fails with error "Schema contains duplicate unqualified field name \'n_nationkey\'" https://issues.apache.org/jira/browse/ARROW-11432
> > > > > > > * 9 Fails with error "Cartesian joins are not supported"
> > > > > > > * 11 HAVING support https://github.com/apache/arrow/pull/9364 (but also requires IN (subquery) in expression)
> > > > > > > * 13 Filters in JOIN condition: "Unsupported expression \'NotLike\' in JOIN condition"
> > > > > > > * 14 CASE WHEN expressions are not coerced yet, query fails with error "false_values downcast failed"
> > > > > > > * 15 VIEW/multiple statement support: "The context currently only supports a single SQL statement"
> > > > > > > * 16 IN (Subquery) in (WHERE) expression
> > > > > > > * 17 Subquery in (WHERE) expression
> > > > > > > * 18 IN (Subquery) in (WHERE) expression
> > > > > > > * 19 Fails with error "Cartesian joins are not supported"
> > > > > > > * 20 IN (Subquery) in (WHERE) expression
> > > > > > > * 21 Compound identifier not supported: "Unsupported compound identifier \'[\"l1\", \"l_suppkey\"]"
> > > > > > > * 22 Fails with parser error for the syntax SUBSTRING(col FROM 1 FOR 2)
> > > > > > >
> > > > > > > Other functionality not causing failures now, but needed:
> > > > > >
> > > > > > * EXTRACT https://github.com/apache/arrow/pull/9359
> > > > > > * EXISTS
> > > > > >
> > > > > > Am I missing any JIRA issues / PRs or features in this list? I
> > would
> > > > like
> > > > > > to create some issues on JIRA so we can tackle this.
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Daniël
> > > > > >
> > > > >
> > > >
> > >
> >
>
