2010YOUY01 commented on code in PR #14025: URL: https://github.com/apache/datafusion/pull/14025#discussion_r1906203641
##########
datafusion/functions/src/unicode/reverse.rs:
##########
@@ -116,14 +115,23 @@ pub fn reverse<T: OffsetSizeTrait>(args: &[ArrayRef]) -> Result<ArrayRef> {
     }
 }
 
-fn reverse_impl<'a, T: OffsetSizeTrait, V: ArrayAccessor<Item = &'a str>>(
+fn reverse_impl<'a, T: OffsetSizeTrait, V: StringArrayType<'a>>(
     string_array: V,
 ) -> Result<ArrayRef> {
-    let result = ArrayIter::new(string_array)
-        .map(|string| string.map(|string: &str| string.chars().rev().collect::<String>()))
-        .collect::<GenericStringArray<T>>();
+    let mut builder: GenericStringBuilder<T> =
+        GenericStringBuilder::with_capacity(string_array.len(), 1024);

Review Comment:
I noticed that `get_array_memory_size()` will overestimate ([source](https://github.com/apache/arrow-rs/blob/4f1f6e57c568fae8233ab9da7d7c7acdaea4112a/arrow-array/src/array/byte_view_array.rs#L600-L607)), and many other string functions do not use an accurate estimate either.

I think it's okay to keep it simple and use the default size for now. In the future we could introduce a function that calculates only the payload size, and use it to make the estimate correct for all usages of `GenericStringBuilder`.

Thank you for the experiment.
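
To make the "payload size only" idea concrete, here is a minimal sketch. It uses plain `Option<&str>` slices and a `String` buffer instead of Arrow arrays and `GenericStringBuilder`, so it compiles on its own. `payload_size` and `reverse_all` are hypothetical names for illustration, not arrow-rs or DataFusion APIs. It relies on one fact: reversing a string by `char` preserves its UTF-8 byte length, so for `reverse` the output payload should equal the input payload exactly.

```rust
/// Exact payload size: the sum of the UTF-8 byte lengths of all non-null
/// values. Hypothetical helper, not an arrow-rs API; it ignores offsets,
/// null bitmaps, and any spare buffer capacity.
fn payload_size(values: &[Option<&str>]) -> usize {
    values.iter().flatten().map(|s| s.len()).sum()
}

/// Reverses every non-null value into one contiguous, pre-sized buffer,
/// loosely mimicking how a string builder appends values.
/// Returns the data buffer and the end offset of each value (`None` for nulls).
fn reverse_all(values: &[Option<&str>]) -> (String, Vec<Option<usize>>) {
    // Reversal by `char` keeps the byte length, so this capacity is exact.
    let mut data = String::with_capacity(payload_size(values));
    let mut ends = Vec::with_capacity(values.len());
    for value in values {
        match value {
            Some(s) => {
                // Append the characters in reverse order.
                data.extend(s.chars().rev());
                ends.push(Some(data.len()));
            }
            None => ends.push(None),
        }
    }
    (data, ends)
}
```

For the input `[Some("abc"), None, Some("héllo")]`, `payload_size` gives 9 bytes (3 + 6, since `é` takes two bytes), and `reverse_all` fills the buffer with exactly those 9 bytes. Whether an exact pre-pass like this pays off compared with a fixed default capacity would still need to be measured per function.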