Hi Robert,

Thanks for getting back to me. We are currently assessing Flink Standalone on 
Kubernetes and Native Flink on Kubernetes and haven't yet decided on which 
model we intend to use. We want to ensure that whichever model we choose, we'll 
be able to get the benefits of the new features added by the community.

>> We are certainly aware that support for active deployments is a much desired 
>> feature. The "problem" with the 1.13 implementation of reactive mode is that 
>> it will try to acquire infinite resources from an active resource manager.

Good point, thanks for explaining why this is a challenge for active mode. I'm 
wondering whether it would help to have a min and max parallelism, with the 
actual parallelism determined by the scaling policy discussed below?

>> For integration with an active deployment, how would you like to control the 
>> scaling behavior of Flink? (for example via a REST API call to Flink's 
>> JobManager, or via a programmatic scaling policy, or a configured scaling 
>> policy? If you prefer a scaling policy, which metric would you like to 
>> consider?)

In the long term, I think having some kind of pluggable/extensible scaling 
policy would be best for users to allow flexibility in choosing metrics that 
are important for their use case. Making it configurable might make it easier 
to pick and choose different policies if they are available, without needing to 
make code changes.

Some possible metrics to start with could be related to resource utilization, 
such as CPU or memory, or to other characteristics, such as how far the job is 
lagging.
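
To make the idea a bit more concrete, here is a rough sketch of what a 
pluggable policy bounded by a min and max parallelism could look like. All of 
the names here (ScalingPolicy, CpuUtilizationPolicy, BoundedScaler, and the 
"cpuUtilization" metric key) are hypothetical and are not part of Flink's API; 
this only illustrates the shape of the proposal, not an actual implementation.

```java
import java.util.Map;

// Hypothetical extension point: NOT an existing Flink interface.
interface ScalingPolicy {
    // Returns the parallelism the policy would like, before any bounds apply.
    int desiredParallelism(int currentParallelism, Map<String, Double> metrics);
}

// Example policy (assumption): scale in proportion to CPU utilization
// relative to a target utilization in the range (0, 1].
final class CpuUtilizationPolicy implements ScalingPolicy {
    private final double targetUtilization;

    CpuUtilizationPolicy(double targetUtilization) {
        if (targetUtilization <= 0.0 || targetUtilization > 1.0) {
            throw new IllegalArgumentException("target must be in (0, 1]");
        }
        this.targetUtilization = targetUtilization;
    }

    @Override
    public int desiredParallelism(int currentParallelism, Map<String, Double> metrics) {
        Double cpu = metrics.get("cpuUtilization"); // hypothetical metric key
        if (cpu == null) {
            return currentParallelism; // no data: keep the current parallelism
        }
        return (int) Math.ceil(currentParallelism * cpu / targetUtilization);
    }
}

// Clamps whatever the policy asks for into [minParallelism, maxParallelism],
// so an active resource manager is never asked for unbounded resources.
final class BoundedScaler {
    private final int minParallelism;
    private final int maxParallelism;
    private final ScalingPolicy policy;

    BoundedScaler(int minParallelism, int maxParallelism, ScalingPolicy policy) {
        if (minParallelism < 1 || maxParallelism < minParallelism) {
            throw new IllegalArgumentException("need 1 <= min <= max");
        }
        this.minParallelism = minParallelism;
        this.maxParallelism = maxParallelism;
        this.policy = policy;
    }

    int nextParallelism(int currentParallelism, Map<String, Double> metrics) {
        int desired = policy.desiredParallelism(currentParallelism, metrics);
        return Math.max(minParallelism, Math.min(maxParallelism, desired));
    }
}

class ScalingPolicySketch {
    public static void main(String[] args) {
        BoundedScaler scaler = new BoundedScaler(2, 8, new CpuUtilizationPolicy(0.5));
        System.out.println(scaler.nextParallelism(4, Map.of("cpuUtilization", 0.75)));
    }
}
```

A different metric, such as lag, would just be another ScalingPolicy 
implementation, and the choice of policy could then be made via configuration 
rather than code changes.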

Since we are in early stages of just assessing what kind of deployment model 
we'd like to use, it's hard to say what will work best for us. We just want to 
see if reactive mode will be available in the future so that we can leverage it 
when we have more data.

Thanks,
Sonam


________________________________
From: Robert Metzger <rmetz...@apache.org>
Sent: Thursday, March 11, 2021 5:28 AM
To: Sonam Mandal <soman...@linkedin.com>
Cc: user@flink.apache.org <user@flink.apache.org>
Subject: Re: Question about Reactive mode support

Hey Sonam,

I'm very happy to hear that you are interested in reactive mode. Your 
understanding of the limitations for 1.13 is correct. Note that you can deploy 
standalone Flink on Kubernetes [1]. I'm actually currently preparing a demo for 
this [2].

We are certainly aware that support for active deployments is a much desired 
feature. The "problem" with the 1.13 implementation of reactive mode is that it 
will try to acquire infinite resources from an active resource manager.

For integration with an active deployment, how would you like to control the 
scaling behavior of Flink? (for example via a REST API call to Flink's 
JobManager, or via a programmatic scaling policy, or a configured scaling 
policy? If you prefer a scaling policy, which metric would you like to 
consider?)

Best,
Robert

[1] 
https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/standalone/kubernetes/
[2] 
https://github.com/rmetzger/flink-reactive-mode-k8s-demo
 (attention, this is really work in progress!)




On Wed, Mar 10, 2021 at 5:32 PM Sonam Mandal <soman...@linkedin.com> wrote:
Hello,

We were going through 
FLIP-159 <https://cwiki.apache.org/confluence/display/FLINK/FLIP-159%3A+Reactive+Mode>
 and 
FLIP-160 <https://cwiki.apache.org/confluence/display/FLINK/FLIP-160%3A+Adaptive+Scheduler>
 and found this feature of interest to us for auto-scaling purposes. The 
limitations indicate that Flink 1.13 will release this only for standalone 
deployments in application mode.

Will this be extended in future releases to other active deployments such as 
Native Flink on Kubernetes? What about session mode?

Thanks,
Sonam
