Hey everyone,

There is a feature [1] for integrating Beam with OpenLineage. A while back I was planning to work on it but never got around to it [2]. I've been revisiting this feature and would like to take it up again.

Please take a look at the proposal [3], which covers building a lineage graph for Beam's local runner(s) and integrating with OpenLineage's open standard for lineage collection. Any feedback is much appreciated.

The proposal targets the Python and Java Direct Runners and the Prism Runner, for the sake of completeness. However, I'm looking for input on which local runner we should start with. Of course, the ultimate goal is to support the Prism Runner, and work is already underway to make it the default local runner for some Python pipelines (which kind of makes the Python Direct Runner not worthwhile to start with). At the same time, the Prism Runner is still in active development, and there might be blockers that I'm not aware of...

Best,
Charles

[1] https://github.com/apache/beam/issues/33981
[2] https://lists.apache.org/thread/wwm8qnymvoy80lvdkr4p8hwrpdrot9do
[3] https://docs.google.com/document/d/1Styamoo35QSn0mp4iaL8MUfE2r0p1iqmoDDMZdpSlGQ/edit?usp=sharing
