Hi all,

First-time poster, so go easy on me :)

What is Flink's story for accommodating task workloads with vastly disparate resource requirements, e.g. where some tasks require very little CPU and RAM while others require quite a lot?
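To make the question concrete, here's the kind of thing I'd love to be able to write. To be clear, this is *not* a real Flink API -- ResourceProfile and withResources() are pure invention on my part, just to show the shape of what I'm after:

    // Hypothetical -- neither ResourceProfile nor withResources() exist in
    // Flink's public API today. The idea: declare per-operator requirements
    // and let the scheduler find (or provision) a right-sized task manager,
    // instead of sizing every task manager for the worst case.
    ResourceProfile heavy = ResourceProfile.newBuilder()
            .setCpuCores(8.0)
            .setTaskHeapMemoryMB(16 * 1024)
            .build();

    tasks.map(new RenderFrame())   // our resource-hungry step (RenderFrame is ours)
         .withResources(heavy);    // cheap steps would get a small default profile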
Our current strategy is to bundle the resource-intensive tasks and send them to a separate batch-execution framework; for this we use AWS Thinkbox Deadline [1]. (A simplified sketch of that split is at the end of this mail.) Deadline's scheduler supports numerous strategies for pairing work with a right-sized worker -- "criteria" (numeric metadata like min/max RAM and CPU requirements) and "pools" (basically named resource tags) -- as well as for deciding when to schedule tasks -- "priorities" and "limits" (a check-in/check-out system for finite resources, like software licenses).

Are others using a similar strategy, or are you provisioning your task managers for the worst-case scenario? Outsourcing the resource-intensive tasks to a separate batch framework complicates the design of our pipeline and bifurcates our resource pool, so I'd rather use Flink for the whole process.

I searched around and found two Jira tickets that could form the foundation of a solution to this problem:

- https://issues.apache.org/jira/browse/FLINK-9953: Active Kubernetes integration
- https://issues.apache.org/jira/browse/FLINK-10240: Pluggable scheduling strategy for batch jobs

Sadly, the latter seems to have stalled. I read the design doc [2] for the active K8s integration, and this statement seemed crucial:

> If integrated with k8s natively, Flink's resource manager will call kubernetes API to allocate/release new pods and adjust the resource usage on demand, and then support scale-up and scale down.

This is particularly powerful when your k8s cluster is itself backed by auto-scaling of nodes (as with the GKE cluster autoscaler [3]), but it's unclear from the doc *when and how* resources are adjusted based on demand. Will it simply scale a shared pool of resource-identical task managers up and down based on the size of the task backlog (or some other metric that signals "falling behind"), or will a task have a way of specifying and acquiring an execution resource that matches its specific performance profile?

Based on the docs for the YARN resource manager [4], it acquires a pool of task managers with identical specs -- e.g. "yarn-session.sh -n 4 -tm 8192 -s 8" starts four identical 8 GB task managers. If that model is also used for the K8s resource manager, task managers would still have to be provisioned for the worst-case scenario (particularly in terms of RAM per process), which for us would mean they are drastically over-provisioned for the common tasks.

I'm new to Flink, so there's a good chance I've overlooked something important; I'm looking forward to learning more!

-thanks
chad

[1] https://www.awsthinkbox.com/deadline
[2] https://docs.google.com/document/d/1Zmhui_29VASPcBOEqyMWnF3L6WEWZ4kedrCqya0WaAk/edit
[3] https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler
[4] https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/yarn_setup.html
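P.S. Since I mentioned our current split, here's a boiled-down sketch of how we route work today, in case it clarifies the kind of heterogeneity I'm talking about. The side-output mechanics are standard DataStream API; the Task type, its estimatedRamMb field, the 2 GB threshold, and the Deadline-submission sink are stand-ins for our own code:

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.ProcessFunction;
    import org.apache.flink.util.Collector;
    import org.apache.flink.util.OutputTag;

    public class SplitByResourceNeeds {

        // Stand-in for our task descriptor; estimatedRamMb is our own
        // metadata, not a Flink concept.
        public static class Task {
            public String id;
            public int estimatedRamMb;
        }

        // Side-output tag for work we hand off to the render farm. Declared
        // as an anonymous subclass so Flink can capture the element type.
        static final OutputTag<Task> HEAVY = new OutputTag<Task>("heavy-tasks") {};

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();
            DataStream<Task> tasks = env.fromElements(new Task()); // stand-in source

            SingleOutputStreamOperator<Task> light = tasks.process(
                    new ProcessFunction<Task, Task>() {
                        @Override
                        public void processElement(Task t, Context ctx, Collector<Task> out) {
                            if (t.estimatedRamMb > 2048) { // threshold is illustrative
                                ctx.output(HEAVY, t);      // -> external batch farm (Deadline)
                            } else {
                                out.collect(t);            // -> handled inside Flink
                            }
                        }
                    });

            light.getSideOutput(HEAVY).print(); // in reality: a sink that submits to Deadline
            light.print();                      // in reality: the rest of the Flink pipeline

            env.execute("split-by-resource-needs");
        }
    }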