Thank you, Charles, for pushing this forward. Looking forward to the PRs.

On Tue, Oct 21, 2025 at 6:25 AM Charles Nguyen <[email protected]> wrote:

> Thank you very much for the support and feedback! I've put some more
> thought into it over the past week, and decided to go forward with Prism
> Runner now given the focus for it to be the default local runner.
>
> It's a bit of a problem that the OpenLineage Go client is available
> through a third party and not maintained by OpenLineage, though they do
> work after some initial experiment. There's also some work to be done (
> https://github.com/apache/beam/pull/36578) to support the lineage metrics
> for Prism Runner, though hopefully it should be straightforward.
>
> Best,
> Charles
>
> On Wed, Oct 15, 2025 at 6:09 PM Robert Burke <[email protected]> wrote:
>
>> I'm happy to help review/advise on how to integrate it with Prism, as
>> time permits.
>>
>> Based on the design, and that metrics are sent back to the runner, it
>> should be straightforward to add special handling for the lineage metrics
>> to collect them for querying from the Job service handle.
>>
>> There's also an option to perhaps render them with the (very rudimentary)
>> Web UI that prism also provides in standalone mode. Though a cursory look
>> only shows a single OpenLineage implementation in Go, generated from the
>> spec.
>>
>> On Wed, Oct 15, 2025, 11:15 AM Willy Lulciuc <[email protected]>
>> wrote:
>>
>>> Hey Charles,
>>>
>>> I'm the co-creator of OpenLineage and project lead of Marquez (reference
>>> implementation of the OLin spec). I read your proposal. Very exciting stuff!
>>> Let me know how I can help in the initial design phase (or development).
>>> I worked on the Airflow and Spark integrations for OLin and just opened a
>>> proposal to extend the spec to support ML training (see ML support for
>>> OpenLineage <https://github.com/OpenLineage/OpenLineage/issues/4035>)
>>> that I think would be very relevant for Beam.
>>>
>>> Cool to see this work kick off!
>>>
>>> On Tue, Oct 14, 2025 at 12:48 PM Charles Nguyen <[email protected]>
>>> wrote:
>>>
>>>> Hey everyone,
>>>>
>>>> There is this feature [1] to integrate Beam with OpenLineage, and a
>>>> while back I was planning to work on this but never got around to it [2].
>>>> I've been revisiting this feature and want to take this up again. Please
>>>> take a look at the proposal [3] to support building a lineage graph for
>>>> Beam's local runner(s) and integration with OpenLineage's open standard for
>>>> lineage collection. Any feedback is much appreciated.
>>>>
>>>> The proposal targets Python and Java Direct Runners and Prism Runner,
>>>> for the sake of completeness. But I'm looking for input on which local
>>>> runner should we proceed with as a start. Of course the ultimate goal is to
>>>> support Prism Runner, and there's work already underway to make Prism
>>>> Runner the default local runner for some Python pipelines (which kinda
>>>> makes Python Direct Runner not worthwhile to start with). At the same time,
>>>> Prism Runner is also actively in development and there might be blockers
>>>> that I'm not aware of...
>>>>
>>>> Best,
>>>> Charles
>>>>
>>>> [1] https://github.com/apache/beam/issues/33981
>>>> [2] https://lists.apache.org/thread/wwm8qnymvoy80lvdkr4p8hwrpdrot9do
>>>> [3]
>>>> https://docs.google.com/document/d/1Styamoo35QSn0mp4iaL8MUfE2r0p1iqmoDDMZdpSlGQ/edit?usp=sharing
