Hey all,

Quite a few features have been added to DataFusion in the last couple of months.
One test of the functionality we support is the TPC-H benchmark. We can now run 7 out of 22 queries without errors. I think a nice goal would be complete support for the full suite, since that means a lot of functionality is covered, helps with optimization, and lets us test against other engines.

These queries currently fail because of missing features or bugs:

* 2 IN (subquery) in WHERE expression
* 4 Intervals https://github.com/apache/arrow/pull/9373
* 7 Fails with error "Schema contains duplicate unqualified field name 'n_nationkey'" https://issues.apache.org/jira/browse/ARROW-11432
* 8 Fails with error "Schema contains duplicate unqualified field name 'n_nationkey'" https://issues.apache.org/jira/browse/ARROW-11432
* 9 Fails with error "Cartesian joins are not supported"
* 11 HAVING support https://github.com/apache/arrow/pull/9364 (but also requires IN (subquery) in expression)
* 13 Filters in JOIN condition: "Unsupported expression 'NotLike' in JOIN condition"
* 14 CASE WHEN expressions are not coerced yet, query fails with error "false_values downcast failed"
* 15 VIEW/multiple statement support: "The context currently only supports a single SQL statement"
* 16 IN (subquery) in WHERE expression
* 17 Subquery in WHERE expression
* 18 IN (subquery) in WHERE expression
* 19 Fails with error "Cartesian joins are not supported"
* 20 IN (subquery) in WHERE expression
* 21 Compound identifier not supported: "Unsupported compound identifier '["l1", "l_suppkey"]'"
* 22 Fails with parser error for the syntax SUBSTRING(col FROM 1 FOR 2)

Other functionality that is not causing failures now, but is needed:

* EXTRACT https://github.com/apache/arrow/pull/9359
* EXISTS

Am I missing any JIRA issues, PRs, or features in this list? I would like to create some issues on JIRA so we can tackle this.

Best regards,
Daniël