fm100 opened a new issue, #22944: URL: https://github.com/apache/datafusion/issues/22944
### Is your feature request related to a problem or challenge?

OpenLineage has become a common standard for collecting lineage metadata from processing engines. DataFusion is increasingly used to build query engines, but each DataFusion-based project currently needs to implement lineage extraction independently. This leads to duplicated effort and inconsistent OpenLineage support.

### Describe the solution you'd like

I would like DataFusion to expose OpenLineage support, either directly or through stable APIs/hooks that downstream engines can use.

Useful metadata to capture would include:

* Resolved input and output datasets
* Dataset schemas
* Column-level lineage, where possible
* Logical and/or physical plans, if appropriate
* Query metadata such as query ID, status, timing, and errors

I do not have a strong preference on the implementation. A separate crate, feature flag, or stable lineage extraction API would all be reasonable options.

### Describe alternatives you've considered

Each DataFusion-based engine could implement OpenLineage support independently by inspecting SQL, logical plans, or physical plans. However, this duplicates work, may depend on unstable internals, and can produce inconsistent lineage semantics.

### Additional context

OpenLineage integration would make DataFusion more useful as a foundation for production query engines and data platforms, especially for projects that want lineage and observability support without building it from scratch.

I am not very familiar with the DataFusion codebase yet, but I would be happy to collaborate with the DataFusion community on the OpenLineage side and help shape the expected metadata/modeling requirements.
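To make the metadata list above concrete, here is a rough sketch of the kind of OpenLineage-style run event a DataFusion-based engine might emit for a single query. It is only an illustration of the target shape, not a proposed DataFusion API. The namespace, table names, columns, job name, and producer URI are all hypothetical, and required OpenLineage fields such as `schemaURL` and the per-facet `_producer`/`_schemaURL` entries are left out for brevity.

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical producer URI; a real integration would identify itself here.
PRODUCER = "https://example.com/datafusion-openlineage-sketch"


def dataset(namespace, name, fields, extra_facets=None):
    """Describe a resolved dataset and its schema as an OpenLineage-style dict."""
    facets = {"schema": {"fields": [{"name": n, "type": t} for n, t in fields]}}
    facets.update(extra_facets or {})
    return {"namespace": namespace, "name": name, "facets": facets}


def column_lineage(mapping):
    """Build a column-lineage facet from {output_column: [(namespace, dataset, column)]}."""
    return {
        "fields": {
            out_col: {
                "inputFields": [
                    {"namespace": ns, "name": ds, "field": col}
                    for ns, ds, col in sources
                ]
            }
            for out_col, sources in mapping.items()
        }
    }


def run_event(event_type, job_name, inputs, outputs, run_id=None, error=None):
    """Assemble a run event carrying query ID, status, timing, and an optional error."""
    run = {"runId": run_id or str(uuid.uuid4()), "facets": {}}
    if error is not None:
        # Assumed shape of the error facet; check the OpenLineage spec before relying on it.
        run["facets"]["errorMessage"] = {
            "message": error,
            "programmingLanguage": "Rust",
        }
    return {
        "eventType": event_type,  # e.g. START, COMPLETE, FAIL
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": PRODUCER,
        "run": run,
        "job": {"namespace": "example-engine", "name": job_name},
        "inputs": inputs,
        "outputs": outputs,
    }


# Hypothetical query:
#   INSERT INTO daily_totals
#   SELECT order_date, SUM(amount) AS total FROM orders GROUP BY order_date
NAMESPACE = "file:///data"  # hypothetical dataset namespace

orders = dataset(NAMESPACE, "orders", [("order_date", "Date32"), ("amount", "Float64")])
daily_totals = dataset(
    NAMESPACE,
    "daily_totals",
    [("order_date", "Date32"), ("total", "Float64")],
    extra_facets={
        "columnLineage": column_lineage(
            {
                "order_date": [(NAMESPACE, "orders", "order_date")],
                "total": [(NAMESPACE, "orders", "amount")],
            }
        )
    },
)

event = run_event("COMPLETE", "daily_totals_load", [orders], [daily_totals])
print(json.dumps(event, indent=2))
```

The point of the sketch is that every item in the list (datasets, schemas, column-level lineage, status, timing, errors) maps onto a field that an engine must currently reconstruct on its own from SQL or plan internals. A stable extraction hook would let that mapping be written once.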
