erenavsarogullari opened a new pull request, #20484:
URL: https://github.com/apache/datafusion/pull/20484
## Which issue does this PR close?
- Closes #20483.
## Rationale for this change
Currently, the Spark `shuffle` function returns the following error message when
`seed` is `NULL`. The message should report `'NULL'` as the received type
instead of `'Int64'`.
**Current:**
```
query error
SELECT shuffle([2, 1], NULL);
----
DataFusion error: Execution error: shuffle seed must be Int64 type, got
'Int64'
```
**New:**
```
query error DataFusion error: Execution error: shuffle seed must be Int64
type but got 'NULL'
SELECT shuffle([1, 2, 3], NULL);
```
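As a rough illustration of the intended behavior, the sketch below reports a null seed as `'NULL'` rather than as its declared type. It assumes the old message came from naming the declared type of a typed null. `SeedArg`, `type_name` and `extract_seed` are hypothetical names used only for this example, and they are not DataFusion's `ScalarValue` API or the code in `shuffle.rs`.

```rust
/// Hypothetical stand-in for a scalar seed argument (NOT DataFusion's `ScalarValue`).
#[derive(Debug, Clone, PartialEq)]
enum SeedArg {
    /// A typed Int64 value; `None` models a typed NULL.
    Int64(Option<i64>),
    /// An untyped NULL literal.
    Null,
    /// Any other type, represented here by a string value.
    Utf8(String),
}

impl SeedArg {
    /// Name used in error messages. A typed NULL is reported as "NULL",
    /// not as its declared type (assumed to be the source of the old message).
    fn type_name(&self) -> &'static str {
        match self {
            SeedArg::Int64(Some(_)) => "Int64",
            SeedArg::Int64(None) | SeedArg::Null => "NULL",
            SeedArg::Utf8(_) => "Utf8",
        }
    }
}

/// Returns the seed, or an error message in the form shown in the "New" example.
fn extract_seed(arg: &SeedArg) -> Result<i64, String> {
    match arg {
        SeedArg::Int64(Some(seed)) => Ok(*seed),
        other => Err(format!(
            "Execution error: shuffle seed must be Int64 type but got '{}'",
            other.type_name()
        )),
    }
}
```

In this sketch, `extract_seed(&SeedArg::Int64(None))` and `extract_seed(&SeedArg::Null)` produce the same `'NULL'` message.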
In addition to this fix, this PR also introduces the following refactoring of
the `shuffle` function:
- Combining the argument validation checks into a single error message,
- Extending the current error message with the expected data types:
```
Current:
shuffle does not support type '{array_type}'.
New:
shuffle does not support type '{array_type}'. Expected types: List,
LargeList, FixedSizeList or Null.
```
- Adding new unit test coverage in both `shuffle.rs` and `shuffle.slt`.
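
The extended array-type message can be sketched roughly as follows. `ArrayType` and `check_array_type` are hypothetical names for this illustration only; the actual implementation presumably matches on Arrow data types rather than on an enum like this one.

```rust
/// Hypothetical stand-in for the data type of the first `shuffle` argument.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ArrayType {
    List,
    LargeList,
    FixedSizeList,
    Null,
    /// Example of an unsupported (non-list) type.
    Int64,
}

/// Accepts the list-like types and Null; otherwise returns the extended
/// message that also names the expected types.
fn check_array_type(array_type: ArrayType) -> Result<(), String> {
    match array_type {
        ArrayType::List
        | ArrayType::LargeList
        | ArrayType::FixedSizeList
        | ArrayType::Null => Ok(()),
        other => Err(format!(
            "shuffle does not support type '{:?}'. Expected types: List, LargeList, FixedSizeList or Null.",
            other
        )),
    }
}
```

Listing the accepted types in the message means a caller who passes a non-list argument can see the valid types directly in the error.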
## What changes are included in this PR?
## Are these changes tested?
Yes, new unit test cases are added.
## Are there any user-facing changes?
Yes, the error messages of the Spark `shuffle` function are updated.