We made some progress parallelizing our Python code using Beam on Spark.
Following your advice, we are using Spark 3.2.1.
The Spark server and worker are connected OK.
On a third machine, the client machine, I am running the Docker job server:
$ sudo docker run --net=host apache/beam_spark_job_server:
Hi Beamers (is that a thing?),
I am relatively new to Beam and am attempting to use the Python WriteToMongoDB
transform but ran into some undesirable behavior. The implementation seems to
wait until the entire PCollection has been received to start doing the actual
Mongo writes. My use case req
Hi Beam community,
I am wondering whether histogram metrics are available (or whether there are
alternative recommendations) for showing quantiles. We already have counter
metrics, but we would also like to see quantiles for various values.
Thanks a lot!
Siyu
Siyu - The Beam metrics interface includes the Distribution metric type
which can be used for histograms:
https://beam.apache.org/documentation/programming-guide/#types-of-metrics
Particulars of support depend on the runner. For Cloud Dataflow, the
reported values are MAX, MIN, MEAN, and COUNT, s
Hi Jeff,
Thanks so much for your quick responses. It is unfortunate that histograms are
unavailable in Dataflow. Do you know of any workarounds? Or do you think it
would be feasible to use Runner v2 and customize the image with a
Prometheus exporter?
Thanks again!
Siyu
>
> On Apr 4,