Hey everyone,

There is a feature [1] for integrating Beam with OpenLineage. A while back
I planned to work on it but never got around to it [2]. I've been
revisiting it and would like to take it up again. Please take a look at
the proposal [3], which covers building a lineage graph for Beam's local
runner(s) and integrating with OpenLineage's open standard for lineage
collection. Any feedback is much appreciated.

For the sake of completeness, the proposal targets the Python and Java
Direct Runners and the Prism Runner. But I'm looking for input on which
local runner we should start with. Of course the ultimate goal is to
support the Prism Runner, and there's work already underway to make it the
default local runner for some Python pipelines (which makes the Python
Direct Runner less worthwhile as a starting point). At the same time, the
Prism Runner is still in active development, and there might be blockers
that I'm not aware of.

Best,
Charles

[1] https://github.com/apache/beam/issues/33981
[2] https://lists.apache.org/thread/wwm8qnymvoy80lvdkr4p8hwrpdrot9do
[3] https://docs.google.com/document/d/1Styamoo35QSn0mp4iaL8MUfE2r0p1iqmoDDMZdpSlGQ/edit?usp=sharing
