I'm happy to help review/advise on how to integrate it with Prism, as time permits.
Based on the design, and given that metrics are sent back to the runner, it should be straightforward to add special handling for the lineage metrics to collect them for querying from the Job service handle. There's also an option to perhaps render them with the (very rudimentary) Web UI that Prism also provides in standalone mode. That said, a cursory look shows only a single OpenLineage implementation in Go, generated from the spec.

On Wed, Oct 15, 2025, 11:15 AM Willy Lulciuc <[email protected]> wrote:

> Hey Charles,
>
> I'm the co-creator of OpenLineage and project lead of Marquez (reference
> implementation of the OLin spec). I read your proposal. Very exciting stuff!
> Let me know how I can help in the initial design phase (or development). I
> worked on the Airflow and Spark integrations for OLin and just opened a
> proposal to extend the spec to support ML training (see ML support for
> OpenLineage <https://github.com/OpenLineage/OpenLineage/issues/4035>)
> that I think would be very relevant for Beam.
>
> Cool to see this work kick off!
>
> On Tue, Oct 14, 2025 at 12:48 PM Charles Nguyen <[email protected]>
> wrote:
>
>> Hey everyone,
>>
>> There is this feature [1] to integrate Beam with OpenLineage, and a while
>> back I was planning to work on this but never got around to it [2]. I've
>> been revisiting this feature and want to take this up again. Please take a
>> look at the proposal [3] to support building a lineage graph for Beam's
>> local runner(s) and integration with OpenLineage's open standard for
>> lineage collection. Any feedback is much appreciated.
>>
>> The proposal targets Python and Java Direct Runners and Prism Runner, for
>> the sake of completeness. But I'm looking for input on which local runner
>> should we proceed with as a start. Of course the ultimate goal is to
>> support Prism Runner, and there's work already underway to make Prism
>> Runner the default local runner for some Python pipelines (which kinda
>> makes Python Direct Runner not worthwhile to start with). At the same time,
>> Prism Runner is also actively in development and there might be blockers
>> that I'm not aware of...
>>
>> Best,
>> Charles
>>
>> [1] https://github.com/apache/beam/issues/33981
>> [2] https://lists.apache.org/thread/wwm8qnymvoy80lvdkr4p8hwrpdrot9do
>> [3]
>> https://docs.google.com/document/d/1Styamoo35QSn0mp4iaL8MUfE2r0p1iqmoDDMZdpSlGQ/edit?usp=sharing
>>
>
