Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-30 Thread via GitHub
jkosh44 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2625046192 > > ```rust > > /// Any Null input causes the function to return Null. > > Propogate, > > ``` > > I've updated the PR to use this. I just realized though that windo

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-29 Thread via GitHub
jkosh44 commented on code in PR #14289: URL: https://github.com/apache/datafusion/pull/14289#discussion_r1933573232 ## datafusion/functions-nested/src/extract.rs: ## @@ -330,7 +330,8 @@ pub(super) struct ArraySlice { impl ArraySlice { pub fn new() -> Self { Self {

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-29 Thread via GitHub
jkosh44 commented on code in PR #14289: URL: https://github.com/apache/datafusion/pull/14289#discussion_r1933570686 ## datafusion/physical-expr/src/scalar_function.rs: ## @@ -186,6 +187,15 @@ impl PhysicalExpr for ScalarFunctionExpr { .map(|e| e.evaluate(batch))

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-29 Thread via GitHub
jkosh44 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2621158643 > ```rust > /// Any Null input causes the function to return Null. > Propogate, > ``` I've updated the PR to use this. I just realized though that window and aggre

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-28 Thread via GitHub
jayzhan211 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2620376458 ```rust /// How a function handles Null input. enum NullHandling { /// Pass Null inputs into the function implementation. PassThrough, /// Any Null input

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-28 Thread via GitHub
jayzhan211 commented on code in PR #14289: URL: https://github.com/apache/datafusion/pull/14289#discussion_r1933068989 ## datafusion/functions-nested/src/extract.rs: ## @@ -330,7 +330,8 @@ pub(super) struct ArraySlice { impl ArraySlice { pub fn new() -> Self { Sel

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-28 Thread via GitHub
jkosh44 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2620139202 > I don't like the name strict as it can mean different things (like fail on parsing invalid strings), I think it should be an enum on null handling How about something like

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-28 Thread via GitHub
rluvaton commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2620098993 I don't like the name strict as it can mean different things (like fail on parsing invalid strings), I think it should be an enum on null handling -- This is an automated message

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-26 Thread via GitHub
jkosh44 commented on code in PR #14289: URL: https://github.com/apache/datafusion/pull/14289#discussion_r1929903548 ## datafusion/functions-nested/src/extract.rs: ## @@ -330,7 +330,8 @@ pub(super) struct ArraySlice { impl ArraySlice { pub fn new() -> Self { Self {

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-25 Thread via GitHub
jayzhan211 commented on code in PR #14289: URL: https://github.com/apache/datafusion/pull/14289#discussion_r1929640010 ## datafusion/physical-expr/src/scalar_function.rs: ## @@ -186,6 +186,56 @@ impl PhysicalExpr for ScalarFunctionExpr { .map(|e| e.evaluate(batch))

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-25 Thread via GitHub
jkosh44 commented on code in PR #14289: URL: https://github.com/apache/datafusion/pull/14289#discussion_r1929642070 ## datafusion/functions-nested/src/extract.rs: ## @@ -330,7 +330,8 @@ pub(super) struct ArraySlice { impl ArraySlice { pub fn new() -> Self { Self {

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-25 Thread via GitHub
jkosh44 commented on code in PR #14289: URL: https://github.com/apache/datafusion/pull/14289#discussion_r1929639392 ## datafusion/physical-expr/src/scalar_function.rs: ## @@ -186,6 +186,56 @@ impl PhysicalExpr for ScalarFunctionExpr { .map(|e| e.evaluate(batch))

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-25 Thread via GitHub
jkosh44 commented on code in PR #14289: URL: https://github.com/apache/datafusion/pull/14289#discussion_r1929639226 ## datafusion/physical-expr/src/scalar_function.rs: ## @@ -186,6 +186,56 @@ impl PhysicalExpr for ScalarFunctionExpr { .map(|e| e.evaluate(batch))

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-25 Thread via GitHub
jkosh44 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2614165389 > This is because the signature for extract doesn't handle type checking correctly, it uses variadic_any, not because of the introduced new trait method Oh, great then I don't

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-25 Thread via GitHub
jayzhan211 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2614161185 > > Maybe we need yet another trait implementation > > I think it could potentially be modeled as a field on `Signature` https://docs.rs/datafusion/latest/datafusion/logical

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-25 Thread via GitHub
jayzhan211 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2614158196 > So instead of SELECT array_slice(1.5, NULL, NULL) returning an error for an unsupported type in the first argument, it will return NULL This is because the signature for `

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-25 Thread via GitHub
jkosh44 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2614042678 I made a couple of changes to this PR in the second commit. Previously, I was getting an optimizer error under certain scenarios, for example, ``` > select array_slice([1,2,3],

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-25 Thread via GitHub
jkosh44 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2614003106 > Maybe we need yet another trait implementation One complication with this approach is that we might end up skipping over error checks that are inside of the function implemen

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-25 Thread via GitHub
alamb commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2613986282 > Maybe we need yet another trait implementation I think it could potentially be modeled as a field on `Signature` https://docs.rs/datafusion/latest/datafusion/logical_expr/struc

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-25 Thread via GitHub
jayzhan211 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2613940212 Maybe we need yet another trait implementation ```rust trait ScalarUDFImpl { fn handle_nulls(&self, args: ScalarFunctionArgs) -> Result> { // most of the cas

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-25 Thread via GitHub
alamb commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2613937768 > We could handle such nulls handling in `ScalarFunctionExpr::evaluate` Most SQL functions are "pure" in the sense that if any of their inputs are null they produce output

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-24 Thread via GitHub
jayzhan211 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2613797870 I think another approach is to introduce optimizer rule that transform any function that contains null to null in ConstEvaluator. Is there any function that doesn't return null bu

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-24 Thread via GitHub
jatin510 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2613755067 lgtm @alamb -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-24 Thread via GitHub
jkosh44 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2613713471 > An alternative approach would be to add a function like the following to the `ScalarUDFImpl` trait > >```Rust >/// Returns true if the function should return NULL when

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-24 Thread via GitHub
jkosh44 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2613703184 It looks like a lot of the array functions don't properly handle `NULL`s. For example (I did not exhaustively test them all), ``` > SELECT array_sort([1,2,3], NULL); Inte

[PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-24 Thread via GitHub
jkosh44 opened a new pull request, #14289: URL: https://github.com/apache/datafusion/pull/14289 This commit fixes the array_slice function so that if any arguments are NULL, the result is NULL. Previously, array_slice would return an internal error if any of the arguments were NULL. This be