Blizzara commented on issue #15069: URL: https://github.com/apache/datafusion/issues/15069#issuecomment-2900664889
This is a cool idea! I wanted to see what the results look like, so I tried running `cargo test --test sqllogictests -- --substrait-round-trip aggregate` on your branch. It finds 9 errors in `aggregate_skip_partial.slt` and 136 in `aggregate.slt`. I modified the sqllogic binary to print out all the errors, and counted the unique ones. It seems that lots of them are shared: ``` cargo test --test sqllogictests -- --substrait-round-trip aggregate > aggregate.log cat aggregate.log|grep "query failed: "| cut -f 2- -d " " | sort | uniq -c 7 query failed: DataFusion error: Error during planning: No table named 'tmp_table' 20 query failed: DataFusion error: Internal error: Null cast is not valid. 3 query failed: DataFusion error: Substrait error: Only literal types and aliases are supported in Virtual Tables, got: Cast 16 query failed: DataFusion error: Substrait error: Only literal types and aliases are supported in Virtual Tables, got: ScalarFunction 6 query failed: DataFusion error: This feature is not implemented: GroupingSet CUBE is not yet supported 32 query failed: DataFusion error: This feature is not implemented: Producing a row from empty relation is unsupported 2 query failed: DataFusion error: This feature is not implemented: Unsupported cast type: Dictionary(Int32, Utf8) 15 query failed: DataFusion error: This feature is not implemented: Unsupported cast type: Dictionary(Int64, Int32) 1 query failed: DataFusion error: This feature is not implemented: Unsupported cast type: Duration(Millisecond) 12 query failed: DataFusion error: This feature is not implemented: Unsupported cast type: Duration(Second) 11 query failed: DataFusion error: This feature is not implemented: Unsupported cast type: Time64(Nanosecond) ``` These seem like they would be fixable pretty easily. There are also some more concerning ones, namely 20 cases where the conversion passes but the result seems to be "wrong". Out of those, I can see couple patterns, there's some benign ones: - 3x different temporary aliasing of columns failing EXPLAINs, but would result in correct result - 5x slightly different plan otherwise, but would (?) result in correct result (extra Projection, swap Projection and Sort, duplicate Aggregate) And some real issues: - 9x wrong result due to not respecting IGNORE NULLS in array_agg/first_value/last_value - 3x wrong result otherwise (range window, rollup) It would be great to have these tests run somehow in CI, not failing it, but showing the failures somehow.. and then file issues for each type of failure, and then fix them 😄 Seems like it'd probably be possible to fix a big bunch of the errors by fixing just couple issues, so I think that'd speak for the option 2 from above: > Don't enforce Substrait validation immediately, and do some rounds solving the bugs that contribute the most errors in order to lower that number, and then do 1. (= add `skipif`s) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org