Thank you for answering all my questions. My suggestion would be to start off
with exposing an API to allow dynamically changing operator parallelism as
the users of Flink will be better able to decide the right scaling policy.
Once this functionality is there, it's just a matter of providing policies…
We do exactly what you mentioned. Unfortunately, it's not that simple. Our
services don't have predictable performance, as traffic varies a lot during
the day.
As I've explained above, increasing source parallelism to 2 was enough to tip
over our services, and reducing the parallelism of the async…
I am using the async I/O operator. The problem is that increasing source
parallelism from 1 to 2 was enough to tip our systems over the edge.
Reducing the parallelism of the async I/O operator to 2 is not an option, as
that would reduce throughput quite a bit. This means that no matter what we
do, we…
Yes, exposing an API to adjust the parallelism of individual operators is
definitely a good step towards the auto-scaling feature, which we will
consider. The missing piece is persisting this information so that in case
of recovery you don't recover with a completely different parallelism.
I also a…
Hi Vishal,
thanks a lot for all your feedback on the new reactive mode. I'll try to
answer your questions.
0. In order to avoid confusion, let me quickly explain a bit of terminology:
Reactive Mode is the new feature that allows Flink to react to newly
available resources and to make use of them…
Yes. While back-pressure would eventually ensure high throughput, hand-tuning
parallelism became necessary because the job with high source parallelism
would immediately bring down our internal services, not giving Flink enough
time to adjust the in-rate. Plus, running all operators at such a high…
Hi Vishal,
WRT “bring down our internal services” - a common pattern with making requests
to external services is to measure latency, and throttle (delay) requests in
response to increased latency.
You’ll see this discussed frequently on web crawling forums as an auto-tuning
approach.
Typical…
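The latency-based throttling pattern described above can be sketched roughly as follows. This is a hypothetical plain-Python illustration, not Flink code; the class name, latency target, and backoff constants are all assumptions:

```python
import time


class LatencyThrottle:
    """Adaptive throttle: increase the inter-request delay multiplicatively
    when observed latency exceeds a target, and shrink it additively when
    the downstream service looks healthy. Hypothetical sketch only."""

    def __init__(self, target_latency_s=0.2, min_delay_s=0.0, max_delay_s=5.0):
        self.target = target_latency_s
        self.min_delay = min_delay_s
        self.max_delay = max_delay_s
        self.delay = min_delay_s

    def record(self, observed_latency_s):
        if observed_latency_s > self.target:
            # Service is slow: back off quickly (at least 50 ms, doubling).
            self.delay = min(self.max_delay, max(0.05, self.delay * 2))
        else:
            # Service is healthy: recover throughput gradually.
            self.delay = max(self.min_delay, self.delay - 0.01)

    def wait(self):
        # Call before each request to apply the current delay.
        if self.delay > 0:
            time.sleep(self.delay)


throttle = LatencyThrottle(target_latency_s=0.2)
throttle.record(0.5)   # slow response -> delay becomes 0.05 s
assert throttle.delay == 0.05
throttle.record(0.05)  # fast response -> delay shrinks again
```

The same feedback loop appears in web-crawler politeness policies: the measured round-trip time of each request drives the pause before the next one.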
Well, I was thinking you could have avoided overwhelming your internal
services by using something like Flink's async i/o operator, tuned to limit
the total number of concurrent requests. That way the pipeline could have
uniform parallelism without overwhelming those services, and then you'd
rely on…
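The idea of capping total concurrent requests independently of operator parallelism can be illustrated outside of Flink. The `capacity` argument of Flink's `AsyncDataStream.unorderedWait` is real API; everything else in this plain-Python sketch (class and function names, the fake service call) is a hypothetical stand-in:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def call_service(x):
    # Hypothetical stand-in for a request to an internal service.
    return x * 2


class BoundedAsyncClient:
    """Caps total in-flight requests regardless of how many producers
    submit work -- the same idea as the capacity argument of Flink's
    AsyncDataStream.unorderedWait. Plain-Python sketch, not Flink code."""

    def __init__(self, capacity):
        self.permits = threading.Semaphore(capacity)
        self.pool = ThreadPoolExecutor(max_workers=capacity)

    def submit(self, x):
        # Blocks once `capacity` requests are already in flight.
        self.permits.acquire()
        future = self.pool.submit(call_service, x)
        future.add_done_callback(lambda _: self.permits.release())
        return future


client = BoundedAsyncClient(capacity=4)
futures = [client.submit(i) for i in range(10)]
results = sorted(f.result() for f in futures)
assert results == [i * 2 for i in range(10)]
```

With such a cap in place, raising the pipeline's parallelism changes how fast records are produced, but never how many requests hit the external service at once.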
Interesting. So if I understand correctly, basically you limited the
parallelism of the sources in order to avoid running the job with constant
backpressure, and then scaled up the windows to maximize throughput.
On Tue, May 4, 2021 at 11:23 PM vishalovercome wrote:
> In one of my jobs, windowin…
In one of my jobs, windowing is the costliest operation, while upstream and
downstream operators are not as resource-intensive. There's another operator
in this job that communicates with internal services. This has high
parallelism as well, but not as much as that of the windowing operation.
Running…
Could you describe a situation in which hand-tuning the parallelism of
individual operators produces significantly better throughput than the
default approach? I think it would help this discussion if we could have a
specific use case in mind where this is clearly better.
Regards,
David
On Tue, M…
Forgot to add one more question. 7. If maxParallelism needs to be set to
control parallelism, then wouldn't that mean we would never be able to take
a savepoint and rescale beyond the configured maxParallelism? This would
mean we can never achieve hand-tuned resource efficiency. I will…
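For context on why maxParallelism caps rescaling: Flink partitions keyed state into a fixed number of key groups, set by maxParallelism when the job first starts, and each subtask owns a contiguous range of key groups. A simplified sketch of that assignment (Flink actually murmur-hashes the key's hashCode; Python's built-in hash is a stand-in here):

```python
def key_group(key, max_parallelism):
    # Simplified: Flink uses murmurHash(key.hashCode()) % maxParallelism.
    return hash(key) % max_parallelism


def operator_index(key_group_id, parallelism, max_parallelism):
    # Each subtask owns a contiguous range of key groups.
    return key_group_id * parallelism // max_parallelism


MAX_PARALLELISM = 128  # fixes the number of key groups for the job's lifetime

# Any parallelism up to MAX_PARALLELISM maps every key group to a valid subtask.
for parallelism in (2, 4, 128):
    idx = operator_index(key_group("user-42", MAX_PARALLELISM),
                         parallelism, MAX_PARALLELISM)
    assert 0 <= idx < parallelism
# With parallelism > MAX_PARALLELISM there would be more subtasks than key
# groups, so some subtasks would own no state -- which is why a savepoint
# cannot be rescaled beyond the configured maxParallelism.
```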
Some questions about the adaptive scheduler documentation: "If new slots
become available the job will be scaled up again, up to the configured
parallelism".
Does parallelism refer to maxParallelism or parallelism? I'm guessing it's
the latter, because the doc later mentions "In Reactive Mode (see…