alamb commented on issue #14816: URL: https://github.com/apache/datafusion/issues/14816#issuecomment-2676186699
> Are there existing mechanisms in DataFusion to handle external iterators or row sources?

There is a PR we are currently working on related to metadata columns (which could perhaps provide row ids): https://github.com/apache/datafusion/pull/14057

> What are the best practices for integrating DataFusion with external data sources in a streaming or batched manner?
> Are there any plans or ongoing work in the DataFusion project that might address this use case?
> Any alternative approaches or design patterns that might help us achieve efficient row selection in our multi-engine implementation?

I think you should check out https://github.com/datafusion-contrib/datafusion-federation, which has a variety of items that are used for building a federated query engine.

@philippemnoel may also have ideas / suggestions for this.