Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-27 Thread via GitHub
andygrove commented on PR #14283: URL: https://github.com/apache/datafusion/pull/14283#issuecomment-2617123777 Using `get_coerce_type_for_case_expression` resolved the issue in Comet, so I am closing this PR. Thanks @aweltsch @jayzhan211 @alamb -- This is an automated message from

Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-27 Thread via GitHub
andygrove closed pull request #14283: Fix regression in CASE expression URL: https://github.com/apache/datafusion/pull/14283 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-27 Thread via GitHub
alamb commented on PR #14283: URL: https://github.com/apache/datafusion/pull/14283#issuecomment-2616108222 > Ideally, invoking a physical optimization rule to apply coercion as needed would be nice. I agree -- this would be helpful for any system that only uses the Physical layer.

Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-26 Thread via GitHub
andygrove commented on PR #14283: URL: https://github.com/apache/datafusion/pull/14283#issuecomment-2614533646 Ideally, invoking a physical optimization rule to apply coercion as needed would be nice. -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-26 Thread via GitHub
andygrove commented on PR #14283: URL: https://github.com/apache/datafusion/pull/14283#issuecomment-2614533302 @jayzhan211 Thanks, I will look into using `get_coerce_type_for_case_expression`. @aweltsch Adding type checks in `try_new` makes sense to me. Tomorrow, I will try upd

Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-26 Thread via GitHub
jayzhan211 commented on PR #14283: URL: https://github.com/apache/datafusion/pull/14283#issuecomment-2614279467 1. I think so, the `try_cast` in `else_expr` seems not ideal to me 2. I agree type checking is a choice if we prefer to return Result instead of panicking, so I'm ok with that.

Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-26 Thread via GitHub
aweltsch commented on PR #14283: URL: https://github.com/apache/datafusion/pull/14283#issuecomment-2614269939 I have two questions: 1. If we assume that the expressions in the `CaseExpr` (both `when_then_expr` and `else_expr`) are of the correct type, can we then not remove all of `t

Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-25 Thread via GitHub
jayzhan211 commented on PR #14283: URL: https://github.com/apache/datafusion/pull/14283#issuecomment-2614154584 The function `coerce_types` is used exclusively within function handling. For case expressions, `coerce_types` is not utilized. Instead, the function `get_coerce_type_for_case_exp

Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-25 Thread via GitHub
andygrove commented on PR #14283: URL: https://github.com/apache/datafusion/pull/14283#issuecomment-2614024359 > How do you handle types mismatch issue? Does Comet has another type handling logic to find the correct types for datafusion physical plan? We map Spark types to Arrow types

Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-24 Thread via GitHub
jayzhan211 commented on PR #14283: URL: https://github.com/apache/datafusion/pull/14283#issuecomment-2613810630 > > ```rust > > let then_expr = try_cast(Arc::clone(e), &batch.schema(), return_type.clone())?; > > ``` > > > > > > > > > > > > > >

Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-24 Thread via GitHub
andygrove commented on PR #14283: URL: https://github.com/apache/datafusion/pull/14283#issuecomment-2613808026 > ```rust > > let then_expr = try_cast(Arc::clone(e), &batch.schema(), return_type.clone())?; > > ``` > > > > I think type handling should be handled co

Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-24 Thread via GitHub
jayzhan211 commented on PR #14283: URL: https://github.com/apache/datafusion/pull/14283#issuecomment-2613803842 ```rust let then_expr = try_cast(Arc::clone(e), &batch.schema(), return_type.clone())?; ``` I think type handling should be handled correctly before physical expressio

Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-24 Thread via GitHub
aweltsch commented on PR #14283: URL: https://github.com/apache/datafusion/pull/14283#issuecomment-2613792833 > If the suggested fix here is to be accepted I would rather remove the `expr_or_expr` specialization again, because from what I have seen the current suggestion regresses on the `c

Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-24 Thread via GitHub
aweltsch commented on PR #14283: URL: https://github.com/apache/datafusion/pull/14283#issuecomment-2613791296 I also looked into this a little bit. Here's my observations: The `data_type` of a `CaseExpr` is the first non-null `data_type`. In the `expr_or_expr` specialization this is only

Re: [PR] Fix regression in CASE expression [datafusion]

2025-01-24 Thread via GitHub
jayzhan211 commented on PR #14283: URL: https://github.com/apache/datafusion/pull/14283#issuecomment-2613677877 In the unit test `issue_14277`, we can see that the `then` is `ScalarValue::Null`. What is the reason it is not `ScalarValue::In32(None)`? ```rust let then = Arc::new(Lit