You can also use Python's RenderRunner, e.g.

  python -m apache_beam.examples.wordcount --output out.txt \
    --runner=apache_beam.runners.render.RenderRunner \
    --render_output=pipeline.svg

This also has an interactive mode, triggered by passing --port=N (where 0
picks an unused port), which serves the graph as a local web service. This
lets you expand and collapse composites for easier exploration. Any
--render_output arguments that are passed will be re-rendered as you edit
the graph. (It uses Graphviz under the hood, so it can render to any output
format Graphviz supports.)
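If you would rather drive this from your own code than from the command
line, something like the following should work. This is only a sketch: the
helper names (render_flags, render_toy_pipeline) and the toy pipeline are
made up for illustration, and it assumes apache_beam and the Graphviz `dot`
binary are installed locally. The two flags are the same ones used in the
command above.

```python
from typing import List


def render_flags(outputs: List[str]) -> List[str]:
    """Build the RenderRunner flags from the command line shown above.

    `outputs` is a list of file names; each one becomes its own
    --render_output flag (hypothetical helper, not part of Beam).
    """
    flags = ["--runner=apache_beam.runners.render.RenderRunner"]
    flags += [f"--render_output={path}" for path in outputs]
    return flags


def render_toy_pipeline(outputs: List[str]) -> None:
    """Construct a small pipeline and hand it to the RenderRunner.

    Beam is imported lazily so that render_flags() can be used (and
    checked) on a machine without apache_beam installed.
    """
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(render_flags(outputs))
    with beam.Pipeline(options=options) as p:
        # Illustrative transforms only; substitute your own pipeline here.
        _ = (
            p
            | "Create" >> beam.Create(["to be", "or not to be"])
            | "Split" >> beam.FlatMap(lambda line: line.split())
            | "Count" >> beam.combiners.Count.PerElement()
        )
```

Calling render_toy_pipeline(["pipeline.svg"]) should then write the graph
to pipeline.svg in the current directory.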

For rendering non-Python pipelines, you can start this up as a local
portable "runner":

  python -m apache_beam.runners.render

and then "submit" the job from your other SDK over the Jobs API to view
it.

[image: pipeline.png]



On Fri, Sep 1, 2023 at 7:13 AM Joey Tran <joey.t...@schrodinger.com> wrote:

> Perfect, `pipeline_graph` python module in the stack overflow post [1] was
> exactly what I was looking for. The dependencies I'm working with are a bit
> heavyweight and likely difficult to install into a notebook, so I was
> looking for something I could do on my local machine.
>
> Thanks!
> Joey
>
> [1] -
> https://stackoverflow.com/questions/72592971/way-to-visualize-beam-pipeline-run-with-directrunner
>
> On Fri, Sep 1, 2023 at 8:40 AM Danny McCormick via user <
> user@beam.apache.org> wrote:
>
>> Hey Joey,
>>
>> Dataflow and the Beam Playground are two options, as you mentioned.
>> Locally, many SDKs have local runner options with a visual component. For
>> example, in Python you can use the interactive runner with the
>> apache-beam-jupyterlab-sidepanel extension
>> <https://cloud.google.com/dataflow/docs/guides/interactive-pipeline-development#visualize_the_data_through_the_interactive_beam_inspector>
>> to view pipelines visually locally (this is similar to what the notebooks
>> you reference are doing). You can also just call some of these pieces
>> directly
>> <https://stackoverflow.com/questions/72592971/way-to-visualize-beam-pipeline-run-with-directrunner>
>> without an extension. Go has a dot runner
>> <https://pkg.go.dev/github.com/apache/beam/sdks/v2@v2.50.0/go/pkg/beam/runners/dot>
>> that produces a visual representation of a pipeline. Java has a similar dot
>> renderer <https://mehmandarov.com/apache-beam-pipeline-graph/>.
>>
>> Thanks,
>> Danny
>>
>> On Thu, Aug 31, 2023 at 6:38 PM Joey Tran <joey.t...@schrodinger.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> What're all the current options for visualizing a pipeline? I'm guessing
>>> Dataflow has a visualization. I saw that there are also Apache Beam
>>> notebooks through GCP, and I'm aware of the Beam playground, but is there
>>> an easy way to create and view the visualization locally? For example, I
>>> might have a large codebase that's used to construct and run a pipeline,
>>> and in this case I don't think any of those three solutions would be very
>>> easy to use to visualize my pipeline (though I could be wrong)
>>>
>>> Best,
>>> Joey
>>>
>>> --
>>>
>>> Joey Tran | Senior Developer II | AutoDesigner TL
>>>
>>> *he/him*
>>>
>>> [image: Schrödinger, Inc.] <https://schrodinger.com/>
>>>
>>
