Hi all,

First-time poster, so go easy on me :)

What is Flink's story for accommodating task workloads with vastly disparate resource requirements, e.g. where some tasks require very little CPU and RAM while others require quite a lot?
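To make the question concrete, here's the kind of thing I'd love to be able to write. To be clear, this is *not* a real Flink API -- ResourceProfile and withResources() are pure invention on my part, just to show the shape of what I'm after:

    // Hypothetical -- neither ResourceProfile nor withResources() exist in
    // Flink's public API today. The idea: declare per-operator requirements
    // and let the scheduler find (or provision) a right-sized task manager,
    // instead of sizing every task manager for the worst case.
    ResourceProfile heavy = ResourceProfile.newBuilder()
            .setCpuCores(8.0)
            .setTaskHeapMemoryMB(16 * 1024)
            .build();

    tasks.map(new RenderFrame())   // our resource-hungry step (RenderFrame is ours)
         .withResources(heavy);    // cheap steps would get a small default profile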
Our current strategy is to bundle the resource-intensive tasks and send them to a separate batch-execution framework; for this we use AWS Thinkbox Deadline [1]. (A simplified sketch of that split is at the end of this mail.) Deadline's scheduler supports numerous strategies for pairing work with a right-sized worker -- "criteria" (numeric metadata like min/max RAM and CPU requirements) and "pools" (basically named resource tags) -- as well as for deciding when to schedule tasks -- "priorities" and "limits" (a check-in/check-out system for finite resources, like software licenses).

Are others using a similar strategy, or are you provisioning your task managers for the worst-case scenario? Outsourcing the resource-intensive tasks to a separate batch framework complicates the design of our pipeline and bifurcates our resource pool, so I'd rather use Flink for the whole process.

I searched around and found two Jira tickets that could form the foundation of a solution to this problem:

- https://issues.apache.org/jira/browse/FLINK-9953: Active Kubernetes integration
- https://issues.apache.org/jira/browse/FLINK-10240: Pluggable scheduling strategy for batch jobs

Sadly, the latter seems to have stalled. I read the design doc [2] for the active K8s integration, and this statement seemed crucial:

> If integrated with k8s natively, Flink's resource manager will call kubernetes API to allocate/release new pods and adjust the resource usage on demand, and then support scale-up and scale down.

This is particularly powerful when your k8s cluster is itself backed by auto-scaling of nodes (as with the GKE cluster autoscaler [3]), but it's unclear from the doc *when and how* resources are adjusted based on demand. Will it simply scale a shared pool of resource-identical task managers up and down based on the size of the task backlog (or some other metric that signals "falling behind"), or will a task have a way of specifying and acquiring an execution resource that matches its specific performance profile?

Based on the docs for the YARN resource manager [4], it acquires a pool of task managers with identical specs -- e.g. "yarn-session.sh -n 4 -tm 8192 -s 8" starts four identical 8 GB task managers. If that model is also used for the K8s resource manager, task managers would still have to be provisioned for the worst-case scenario (particularly in terms of RAM per process), which for us would mean they are drastically over-provisioned for the common tasks.

I'm new to Flink, so there's a good chance I've overlooked something important; I'm looking forward to learning more!

-thanks
chad

[1] https://www.awsthinkbox.com/deadline
[2] https://docs.google.com/document/d/1Zmhui_29VASPcBOEqyMWnF3L6WEWZ4kedrCqya0WaAk/edit
[3] https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler
[4] https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/yarn_setup.html
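P.S. Since I mentioned our current split, here's a boiled-down sketch of how we route work today, in case it clarifies the kind of heterogeneity I'm talking about. The side-output mechanics are standard DataStream API; the Task type, its estimatedRamMb field, the 2 GB threshold, and the Deadline-submission sink are stand-ins for our own code:

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.ProcessFunction;
    import org.apache.flink.util.Collector;
    import org.apache.flink.util.OutputTag;

    public class SplitByResourceNeeds {

        // Stand-in for our task descriptor; estimatedRamMb is our own
        // metadata, not a Flink concept.
        public static class Task {
            public String id;
            public int estimatedRamMb;
        }

        // Side-output tag for work we hand off to the render farm. Declared
        // as an anonymous subclass so Flink can capture the element type.
        static final OutputTag<Task> HEAVY = new OutputTag<Task>("heavy-tasks") {};

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();
            DataStream<Task> tasks = env.fromElements(new Task()); // stand-in source

            SingleOutputStreamOperator<Task> light = tasks.process(
                    new ProcessFunction<Task, Task>() {
                        @Override
                        public void processElement(Task t, Context ctx, Collector<Task> out) {
                            if (t.estimatedRamMb > 2048) { // threshold is illustrative
                                ctx.output(HEAVY, t);      // -> external batch farm (Deadline)
                            } else {
                                out.collect(t);            // -> handled inside Flink
                            }
                        }
                    });

            light.getSideOutput(HEAVY).print(); // in reality: a sink that submits to Deadline
            light.print();                      // in reality: the rest of the Flink pipeline

            env.execute("split-by-resource-needs");
        }
    }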