Thanks all for your input! In the coming days I will create an umbrella ticket, plus linked failures/issues, to track progress on TPC-H support, and will share it here.
Daniël

On Thu, Feb 4, 2021 at 00:13, Andrew Lamb <al...@influxdata.com> wrote:

> This is awesome, thank you Daniel. I agree that focusing on enough SQL for
> TPCH queries would be a great idea and way to focus our efforts.
>
> Subqueries may be the largest remaining outstanding item that I see -- I
> have some ideas of how to implement them on the planner side if others are
> interested in collaborating.
>
> Andrew
>
> On Wed, Feb 3, 2021 at 4:02 PM Andy Grove <andygrov...@gmail.com> wrote:
>
> > Thanks for the update on this, Daniël. It is great to see the progress
> > with this!
> >
> > Perhaps it is worth creating one JIRA issue per failing query detailing
> > the errors and we can link these to the issues that are causing the
> > failures?
> >
> > On Wed, Feb 3, 2021 at 1:57 PM Mike Seddon <seddo...@gmail.com> wrote:
> >
> > > Hi Daniël,
> > >
> > > I am working on 22 as part of https://github.com/apache/arrow/pull/9243
> > >
> > > We also need to convert all the Float64 schema types to Decimal(n).
> > >
> > > Cheers,
> > > Mike
> > >
> > > On Thu, Feb 4, 2021 at 5:44 AM Daniël Heres <danielhe...@gmail.com> wrote:
> > >
> > > > Hey all,
> > > >
> > > > Quite a few features have been added to DataFusion in the last
> > > > couple of months.
> > > >
> > > > One test of the functionality we support is the TPC-H benchmark.
> > > > We can now run 7 out of 22 queries without errors.
> > > > I think a nice goal would be having complete support for the full
> > > > suite, as it means a lot of functionality is included, helps
> > > > optimization and helps us to test against other engines.
> > > >
> > > > These queries fail currently because of missing features or bugs:
> > > >
> > > > * 2 IN (Subquery) in (WHERE) expression
> > > > * 4 Intervals https://github.com/apache/arrow/pull/9373
> > > > * 7 Fails with error "Schema contains duplicate unqualified field name 'n_nationkey'" https://issues.apache.org/jira/browse/ARROW-11432
> > > > * 8 Fails with error "Schema contains duplicate unqualified field name 'n_nationkey'" https://issues.apache.org/jira/browse/ARROW-11432
> > > > * 9 Fails with error "Cartesian joins are not supported"
> > > > * 11 HAVING support https://github.com/apache/arrow/pull/9364 (but also requires IN (subquery) in expression)
> > > > * 13 Filters in JOIN condition: "Unsupported expression 'NotLike' in JOIN condition"
> > > > * 14 CASE WHEN expressions are not coerced yet, query fails with error "false_values downcast failed"
> > > > * 15 VIEW/multiple statement support: "The context currently only supports a single SQL statement"
> > > > * 16 IN (Subquery) in (WHERE) expression
> > > > * 17 Subquery in (WHERE) expression
> > > > * 18 IN (Subquery) in (WHERE) expression
> > > > * 19 Fails with error "Cartesian joins are not supported"
> > > > * 20 IN (Subquery) in (WHERE) expression
> > > > * 21 Compound identifier not supported: "Unsupported compound identifier '["l1", "l_suppkey"]"
> > > > * 22 Fails with parser error for the syntax SUBSTRING(col FROM 1 FOR 2)
> > > >
> > > > Other functionality not causing failures now, but needed:
> > > >
> > > > * EXTRACT https://github.com/apache/arrow/pull/9359
> > > > * EXISTS
> > > >
> > > > Am I missing any JIRA issues / PRs or features in this list? I would
> > > > like to create some issues on JIRA so we can tackle this.
> > > >
> > > > Best regards,
> > > >
> > > > Daniël

--
Daniël Heres