Tushar7012 commented on PR #20023: URL: https://github.com/apache/datafusion/pull/20023#issuecomment-3804411253
All CI checks passing.

Regarding the Copilot review comment about memory trade-off: I've added documentation in the code explaining this intentional design decision. The parallelization using `JoinSet` requires collecting files per table_path into memory because spawned tasks need `'static` lifetime, which prevents returning borrowed streams directly.

This is an acceptable trade-off because:

1. The parallelization benefit outweighs the temporary memory overhead for most use cases
2. The WASM fallback preserves streaming behavior for memory-constrained environments
3. Files are collected per-path (not all at once), limiting peak memory usage

Ready for review!
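To show the `'static` constraint in isolation, here is a minimal sketch. It is not the code from this PR. It uses `std::thread::spawn` as a stand-in for `JoinSet::spawn`, since both require the spawned work to be `'static`, and `list_files` and `list_all_parallel` are hypothetical names invented for illustration. Each task takes ownership of its path and returns an owned `Vec` of that path's files, which is where the per-path memory cost comes from.

```rust
use std::collections::HashMap;
use std::thread;

/// Hypothetical stand-in for listing the files under one table path.
/// A real implementation would query an object store instead.
fn list_files(table_path: &str) -> Vec<String> {
    (0..3)
        .map(|i| format!("{}/part-{}.parquet", table_path, i))
        .collect()
}

/// Lists every table path in parallel.
/// Assumption: std threads replace the async tasks used in the PR.
fn list_all_parallel(table_paths: &[String]) -> HashMap<String, Vec<String>> {
    let handles: Vec<_> = table_paths
        .iter()
        .cloned() // each task must own its path: the closure has to be 'static
        .map(|path| {
            thread::spawn(move || {
                // The result is collected into an owned Vec because a stream
                // borrowing from the caller could not leave the spawned task.
                let files = list_files(&path);
                (path, files)
            })
        })
        .collect();

    // Results are gathered one path at a time as the tasks finish.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("listing task panicked"))
        .collect()
}
```

Calling `list_all_parallel` with two paths returns a map with one owned file list per path. Removing `.cloned()` and `move` makes the closure borrow from `table_paths`, and the compiler rejects it for the same lifetime reason described above.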
