Thank you, Charles, for pushing this forward. Looking forward to the PRs.


On Tue, Oct 21, 2025 at 6:25 AM Charles Nguyen <[email protected]> wrote:

> Thank you very much for the support and feedback! I've put some more
> thought into it over the past week, and decided to go forward with Prism
> Runner now, given the focus on making it the default local runner.
>
> It's a bit of a problem that the OpenLineage Go client is only available
> through a third party and is not maintained by OpenLineage, though it does
> work, based on some initial experiments. There's also some work to be done
> (https://github.com/apache/beam/pull/36578) to support the lineage metrics
> for Prism Runner, though hopefully it should be straightforward.
>
> Best,
> Charles
>
> On Wed, Oct 15, 2025 at 6:09 PM Robert Burke <[email protected]> wrote:
>
>> I'm happy to help review/advise on how to integrate it with Prism, as
>> time permits.
>>
>> Based on the design, and that metrics are sent back to the runner, it
>> should be straightforward to add special handling for the lineage metrics
>> to collect them for querying from the Job service handle.
>>
>> There's also an option to perhaps render them with the (very rudimentary)
>> Web UI that prism also provides in standalone mode. Though a cursory look
>> only shows a single OpenLineage implementation in Go, generated from the
>> spec.
>>
>> On Wed, Oct 15, 2025, 11:15 AM Willy Lulciuc <[email protected]>
>> wrote:
>>
>>> Hey Charles,
>>>
>>> I'm the co-creator of OpenLineage and project lead of Marquez (reference
>>> implementation of the OLin spec). I read your proposal. Very exciting stuff!
>>> Let me know how I can help in the initial design phase (or development).
>>> I worked on the Airflow and Spark integrations for OLin and just opened a
>>> proposal to extend the spec to support ML training (see ML support for
>>> OpenLineage <https://github.com/OpenLineage/OpenLineage/issues/4035>)
>>> that I think would be very relevant for Beam.
>>>
>>> Cool to see this work kick off!
>>>
>>> On Tue, Oct 14, 2025 at 12:48 PM Charles Nguyen <[email protected]>
>>> wrote:
>>>
>>>> Hey everyone,
>>>>
>>>> There is this feature [1] to integrate Beam with OpenLineage, and a
>>>> while back I was planning to work on this but never got around to it [2].
>>>> I've been revisiting this feature and want to take this up again. Please
>>>> take a look at the proposal [3] to support building a lineage graph for
>>>> Beam's local runner(s) and integration with OpenLineage's open standard for
>>>> lineage collection. Any feedback is much appreciated.
>>>>
>>>> The proposal targets Python and Java Direct Runners and Prism Runner,
>>>> for the sake of completeness. But I'm looking for input on which local
>>>> runner we should start with. Of course, the ultimate goal is to
>>>> support Prism Runner, and there's work already underway to make Prism
>>>> Runner the default local runner for some Python pipelines (which kinda
>>>> makes Python Direct Runner not worthwhile to start with). At the same time,
>>>> Prism Runner is also actively in development and there might be blockers
>>>> that I'm not aware of...
>>>>
>>>> Best,
>>>> Charles
>>>>
>>>> [1] https://github.com/apache/beam/issues/33981
>>>> [2] https://lists.apache.org/thread/wwm8qnymvoy80lvdkr4p8hwrpdrot9do
>>>> [3]
>>>> https://docs.google.com/document/d/1Styamoo35QSn0mp4iaL8MUfE2r0p1iqmoDDMZdpSlGQ/edit?usp=sharing
>>>>
>>>
