friendlymatthew commented on PR #15361:
URL: https://github.com/apache/datafusion/pull/15361#issuecomment-2783424002

   It may be beneficial to leave the existing code paths undisturbed and decide 
whether to retry on err. 
   
   It avoids the overhead of custom `chrono::Strftime::parse` validation. Plus, 
the retry logic for `to_char_scalar` is simple to implement. Something like:
   
   <details>
   <summary>to_char_scalar</summary>
   <br>
   
   ```rust
     let formatter = ArrayFormatter::try_new(array.as_ref(), &format_options)?;
     let formatted: Result<Vec<Option<String>>, ArrowError> = (0..array.len())
         .map(|i| {
             if array.is_null(i) {
                 Ok(None)
             } else {
                 formatter.value(i).try_to_string().map(Some)
             }
         })
         .collect(); /*
                              will stop iterating upon the first err and 
                              since we're dealing with one format string, 
                              it'll halt after the first format attempt
                       */
   
     if let Ok(formatted) = formatted {
         if is_scalar_expression {
             Ok(ColumnarValue::Scalar(ScalarValue::Utf8(
                 formatted.first().unwrap().clone(),
             )))
         } else {
             Ok(ColumnarValue::Array(
                 Arc::new(StringArray::from(formatted)) as ArrayRef
             ))
         }
     } else {
         // if the format attempt is with a Date32, it's possible the attempt 
failed because
         // the format string contained time-specifiers, so we'll retry as 
Date64s
         if data_type == &Date32 {
             return to_char_scalar(expression.clone().cast_to(&Date64, None)?, 
format);
         }
   
         exec_err!("{}", formatted.unwrap_err())
     }
   ```
   </details>
   
   `to_char_array` is a bit more complex (see 
https://github.com/apache/datafusion/pull/15361#discussion_r2015455219), but 
I'll think it through.
   
   FWIW, it's worth implementing the retry logic and benchmarking the 
performance. I'm happy to do the work. The decision between eager casting and 
selective retry likely comes down to whether we're willing to slightly slow 
down existing behavior in exchange for better performance in the new case. But 
I'm just spitballing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to