AdamGS commented on issue #16452: URL: https://github.com/apache/datafusion/issues/16452#issuecomment-2988819800
Some more findings:

1. `datafusion.optimizer.enable_dynamic_filter_pushdown` doesn't seem to make a difference.
2. I played around with the seeds, and the only one that seems to matter for reproducing the issue is `query_seed`. It doesn't reproduce every time with that seed, but changing it seems to make the issue not reproduce in a reasonable time. The query it generates is:

```sql
SELECT * FROM sort_fuzz_table ORDER BY interval_month_day_nano DESC LIMIT 3
```

I also took a deeper look at the actual data and noticed some surprising things that might explain the failure:

1. The top values in the column the query sorts on are often `null`, which makes me think there's a sort stability issue here (the implementation of `check_equality_of_batches` also points in that direction).
2. A lot of the values in the table aren't really valid? When displayed, many of them show up as conversion errors of numbers into all kinds of temporal types, like `Cast error: Failed to convert -6727098022243200000 to temporal for Date64`.

I wonder if what's happening all comes down to an unstable sort combined with events interleaving in different ways on a multithreaded runtime, which results in different overall outcomes.
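To make the sort-stability hypothesis above concrete, here is a minimal, standalone sketch in plain Rust. It is not DataFusion code: `Row`, `arrival`, `top_k`, `keys`, and `ids` are hypothetical names made up for illustration, and it assumes nulls sort first under `DESC`, which matches the observation that the top values are often `null`. It shows that when more than `LIMIT` rows tie on the sort key, two arrival orders of the same rows produce the same sort keys but different rows.

```rust
use std::cmp::Ordering;

/// Hypothetical row: a nullable sort key plus an id standing in for the
/// other columns returned by `SELECT *`.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    key: Option<i64>,
    id: u32,
}

/// The same five rows in two arrival orders, standing in for two different
/// interleavings of partitions on a multithreaded runtime (an assumption).
fn arrival(reversed: bool) -> Vec<Row> {
    let mut rows = vec![
        Row { key: None, id: 1 },
        Row { key: None, id: 2 },
        Row { key: None, id: 3 },
        Row { key: None, id: 4 },
        Row { key: Some(7), id: 5 },
    ];
    if reversed {
        rows.reverse();
    }
    rows
}

/// ORDER BY key DESC with nulls first, comparing the key only.
fn cmp_desc_nulls_first(a: &Row, b: &Row) -> Ordering {
    match (a.key, b.key) {
        (None, None) => Ordering::Equal,
        (None, Some(_)) => Ordering::Less,
        (Some(_), None) => Ordering::Greater,
        (Some(x), Some(y)) => y.cmp(&x),
    }
}

/// ORDER BY key DESC LIMIT k. `sort_by` is stable, so tied rows keep their
/// arrival order; the nondeterminism comes from the arrival order itself.
fn top_k(mut rows: Vec<Row>, k: usize) -> Vec<Row> {
    rows.sort_by(cmp_desc_nulls_first);
    rows.truncate(k);
    rows
}

/// Project only the sort keys of a result.
fn keys(rows: &[Row]) -> Vec<Option<i64>> {
    rows.iter().map(|r| r.key).collect()
}

/// Project only the row ids of a result.
fn ids(rows: &[Row]) -> Vec<u32> {
    rows.iter().map(|r| r.id).collect()
}
```

Calling `top_k(arrival(false), 3)` and `top_k(arrival(true), 3)` gives results whose `keys` match but whose `ids` differ. A row-by-row equality check would flag these as a mismatch, while a check on the sort keys alone would pass. Adding a unique tie-breaking column to the `ORDER BY` would make the two results identical.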