2010YOUY01 commented on code in PR #14025:
URL: https://github.com/apache/datafusion/pull/14025#discussion_r1906203641


##########
datafusion/functions/src/unicode/reverse.rs:
##########
@@ -116,14 +115,23 @@ pub fn reverse<T: OffsetSizeTrait>(args: &[ArrayRef]) -> Result<ArrayRef> {
     }
 }
 
-fn reverse_impl<'a, T: OffsetSizeTrait, V: ArrayAccessor<Item = &'a str>>(
+fn reverse_impl<'a, T: OffsetSizeTrait, V: StringArrayType<'a>>(
     string_array: V,
 ) -> Result<ArrayRef> {
-    let result = ArrayIter::new(string_array)
-        .map(|string| string.map(|string: &str| string.chars().rev().collect::<String>()))
-        .collect::<GenericStringArray<T>>();
+    let mut builder: GenericStringBuilder<T> =
+        GenericStringBuilder::with_capacity(string_array.len(), 1024);

Review Comment:
   I noticed that `get_array_memory_size()` overestimates the size 
[source](https://github.com/apache/arrow-rs/blob/4f1f6e57c568fae8233ab9da7d7c7acdaea4112a/arrow-array/src/array/byte_view_array.rs#L600-L607),
 and many other string functions do not use an accurate estimate either.
   
   I think it's okay to keep it simple and use the default size for now. Perhaps 
in the future we can introduce a function that calculates only the payload 
size and make the estimation correct for all usages of `GenericStringBuilder`. 
Thank you for the experiment.
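
To illustrate what a payload-only estimate could look like, here is a minimal std-only sketch. It does not use arrow-rs: `payload_len` and `reverse_all` are hypothetical names, and the `String` plus offsets pair only stands in for what a string builder stores. It relies on the fact that reversing a string's `char`s keeps its UTF-8 byte length, so for `reverse` the sum of the input lengths is the exact output payload size.

```rust
/// Hypothetical helper: total UTF-8 payload bytes across all non-null values.
/// Nulls contribute nothing, and no offset or validity overhead is counted.
fn payload_len(values: &[Option<&str>]) -> usize {
    values.iter().flatten().map(|s| s.len()).sum()
}

/// Reverse every value into one shared buffer, recording the end offset of
/// each entry. This loosely mirrors a string builder's data and offset
/// buffers; it is an assumption-laden stand-in, not the arrow-rs builder.
fn reverse_all(values: &[Option<&str>]) -> (String, Vec<usize>) {
    // Reversing chars preserves byte length, so this capacity is exact.
    let mut data = String::with_capacity(payload_len(values));
    let mut offsets = Vec::with_capacity(values.len() + 1);
    offsets.push(0);
    for value in values {
        if let Some(s) = value {
            data.extend(s.chars().rev());
        }
        // A null entry repeats the previous offset (zero-length slot).
        offsets.push(data.len());
    }
    (data, offsets)
}
```

Calling `reverse_all(&[Some("abc"), None, Some("héllo")])` fills a buffer whose final length equals `payload_len` of the same input, which is the property a payload-only estimate would exploit.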
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

