ctsk commented on issue #16206: URL: https://github.com/apache/datafusion/issues/16206#issuecomment-2988799373
I've re-encountered this issue, and (obviously, duh) it gets amplified at larger scale factors (>100). For example, the top 3 items by cycles when running TPC-H query 18 @ SF 300 are:

```
18.07%  tokio-runtime-w  tpch  [.] arrow_select::take::take_byte_view
14.52%  tokio-runtime-w  tpch  [.] <datafusion_physical_plan::coalesce_batches::CoalesceBatchesStream as futures_core::stream::Stream>::poll_next
12.96%  tokio-runtime-w  tpch  [.] core::ptr::drop_in_place<alloc::vec::Vec<arrow_buffer::buffer::immutable::Buffer>>
```

That is close to 50% of cycles spent due to this issue. Because the batches produced by the hash join carry a large number of data buffers, the `CoalesceBatchesExec` gets more expensive too: inside the operator, we iterate over the data buffers to determine whether to perform a garbage collection.

I believe a reasonable fix for this issue is to perform a garbage collection on the build side after the concatenation. The build side would then hold only a single data buffer per string-view / byte-view column.
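A minimal sketch of what that could look like: after the build side finishes `concat_batches`, compact every view column so it references a fresh, minimal set of buffers. The helper name `gc_view_columns` is hypothetical; `GenericByteViewArray::gc` is the existing arrow-rs API for this compaction.

```rust
use std::sync::Arc;

use arrow::array::{Array, ArrayRef, BinaryViewArray, RecordBatch, StringViewArray};
use arrow::error::ArrowError;

/// Hypothetical helper: compact all string-view / byte-view columns of a
/// batch so each holds a single, tightly-packed set of data buffers.
fn gc_view_columns(batch: &RecordBatch) -> Result<RecordBatch, ArrowError> {
    let columns: Vec<ArrayRef> = batch
        .columns()
        .iter()
        .map(|col| {
            if let Some(views) = col.as_any().downcast_ref::<StringViewArray>() {
                // `gc` copies the referenced bytes into new buffers,
                // dropping the many small buffers accumulated while
                // building the hash table.
                Arc::new(views.gc()) as ArrayRef
            } else if let Some(views) = col.as_any().downcast_ref::<BinaryViewArray>() {
                Arc::new(views.gc()) as ArrayRef
            } else {
                Arc::clone(col)
            }
        })
        .collect();
    RecordBatch::try_new(batch.schema(), columns)
}
```

With something like this in place, the probe-side `take` calls would only ever see one data buffer per view column, and `CoalesceBatchesExec` would no longer have to scan long buffer lists to make its garbage-collection decision.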