Re: JdbcIO SQL best practice

2021-04-15 Thread Alexey Romanenko
I don’t think so, because this statement [1] is used in this case.
[1] https://github.com/apache/beam/blob/97af0775cc19a4997a4b60c6a75d003f8e86cf1f/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcUtil.java#L56
> On 14 Apr 2021, at 14:44, Thomas Fredriksen (External) wrote:

Re: General guidance

2021-04-15 Thread Almeida, Julius
Hi Team, Following the issue I faced in an older version of Beam with the Flink runner, I upgraded to Beam 2.28 with the Flink 1.12.1 runner. After running my pipeline for a couple of days, I see that the state size is under control but the memory utilization has spiked a lot. Configs used: From: "Almeida, Julius"

Re: General guidance

2021-04-15 Thread Almeida, Julius
Hi Team, Following the issue I faced in an older version of Beam with the Flink runner, I upgraded to Beam 2.28 with the Flink 1.12.1 runner. I am using the RocksDB state backend. After running my pipeline for a couple of days, I see that the state size is under control but the memory utilization has spiked to full utiliz

Rate Limiting in Beam

2021-04-15 Thread Daniel Thevessen
Hi folks, I've been working on a custom PTransform that makes requests to another service, and would like to add a rate limiting feature there. The fundamental issue that I'm running into here is that I need a decent heuristic to estimate the worker count, so that each worker can independently set
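For readers following the thread, the idea of each worker limiting itself from an estimated worker count can be sketched as below. This is only an illustration and not code from the original message: the names `per_worker_rate` and `TokenBucket` are hypothetical, the even split of the global budget is an assumption, and how the worker count is estimated is exactly the open question in the post.

```python
import time


def per_worker_rate(global_rate: float, estimated_workers: int) -> float:
    """Split a global requests-per-second budget evenly across workers.

    Assumption: every worker should get the same share of the budget.
    """
    if estimated_workers < 1:
        raise ValueError("estimated_workers must be at least 1")
    return global_rate / estimated_workers


class TokenBucket:
    """Minimal token bucket meant to be held by a single worker (hypothetical helper)."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start with a full bucket
        self.clock = clock        # injectable clock, useful for testing
        self.last = clock()

    def try_acquire(self) -> bool:
        """Refill based on elapsed time, then take one token if available."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With a global limit of 100 requests per second and an estimate of 4 workers, each worker would build `TokenBucket(rate=per_worker_rate(100.0, 4), capacity=...)` and call `try_acquire()` before each request. If the estimate is too low, the combined rate exceeds the global limit, which is why the quality of the heuristic matters.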

Re: Rate Limiting in Beam

2021-04-15 Thread Evan Galpin
Could you possibly use a side input with fixed interval triggering [1] to query the Dataflow API to get the most recent log statement of scaling, as suggested here [2]?
[1] https://beam.apache.org/documentation/patterns/side-inputs/
[2] https://stackoverflow.com/a/54406878/6432284
On Thu, Apr 15, 20

Re: Rate Limiting in Beam

2021-04-15 Thread Pablo Estrada
You could implement a Splittable DoFn that generates a limited number of splits. We do something like this for GenerateSequence.from(X).withRate(...) via UnboundedCountingSource[1]. It keeps track of its local EPS, and generates new splits if more EPSs are wanted. This should help you scale up to t
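The "generate new splits if more EPS are wanted" decision can be illustrated with a small calculation. This is a sketch under stated assumptions, not how `UnboundedCountingSource` or any Splittable DoFn is actually implemented: the function name `additional_splits_needed` is hypothetical, and it assumes every split sustains roughly the same observed rate.

```python
import math


def additional_splits_needed(target_eps: float,
                             observed_eps_per_split: float,
                             current_splits: int) -> int:
    """Estimate how many more splits are needed to reach target_eps.

    Assumption: each split sustains about observed_eps_per_split.
    """
    if observed_eps_per_split <= 0:
        raise ValueError("observed_eps_per_split must be positive")
    needed = math.ceil(target_eps / observed_eps_per_split)
    # Never report a negative number; this sketch only scales up.
    return max(0, needed - current_splits)
```

Capping the number of splits this way bounds the total rate at roughly the number of splits times the per-split rate, independent of how many workers the runner happens to start.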