Hi Chad,

If I understand correctly, the scenarios you described involve running batch
jobs, right?

At the moment (Flink 1.8 and earlier), Flink does not differentiate
between the workloads of individual tasks. It uses a slot-sharing
approach [1] to balance workloads among workers. The general idea is to
put tasks with different workloads into the same slot and spread tasks
with similar workloads across different slots, resulting in slots with
similar total workload. This approach works fine for streaming jobs,
where all tasks run at the same time. However, it might not work as well
for batch jobs, where tasks are scheduled stage by stage.
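
To make the slot-sharing idea concrete, here is a minimal sketch (Flink
1.8 DataStream API; the job and the "heavy" operator are made up for
illustration) of pulling an expensive operator out of the default slot
sharing group so it no longer shares a slot with the light-weight ones:

    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class SlotSharingExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            env.fromElements(1, 2, 3)
                // light-weight step, stays in the "default" slot sharing group
                .map(x -> x + 1).returns(Types.INT)
                // stand-in for an expensive step; giving it its own group
                // keeps it from being co-located with the light operators
                .map(x -> x * 2).returns(Types.INT).slotSharingGroup("heavy")
                .print();

            env.execute("slot-sharing-example");
        }
    }

Note that operators inside one group still share slots, so how much this
helps also depends on how many slots each task manager offers.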

You can also refer to the resource management strategy in the Blink
branch. Blink was Alibaba's internal version of Flink, which was open
sourced earlier this year. It customizes task manager resources (on
YARN) according to the tasks' resource requirements. The community and
Alibaba are currently working together to bring the good features of
Blink into Flink master. One of those is fine-grained resource
management, which could help resolve resource management and load
balancing issues for both streaming and batch jobs.

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.8/concepts/runtime.html#task-slots-and-resources

Thank you~

Xintong Song



On Sun, Aug 4, 2019 at 9:40 PM Chad Dombrova <chad...@gmail.com> wrote:

> Hi all,
> First time poster, so go easy on me :)
>
> What is Flink's story for accommodating task workloads with vastly
> disparate resource requirements: e.g. some require very little CPU and RAM,
> while others require quite a lot.
>
> Our current strategy is to bundle resource-intensive tasks and send them
> to a different batch-execution framework.  For this we use AWS | Thinkbox
> Deadline <https://www.awsthinkbox.com/deadline> [1].  Deadline's
> scheduler supports numerous strategies for pairing work with a right-sized
> worker -- criteria (numeric metadata like min/max RAM and CPU requirements)
> and pools (basically named resource tags) -- as well as when to schedule
> tasks -- priorities and limits (a check-in/check-out system for finite
> resources, like a software license).  Are others using a similar strategy,
> or are you provisioning your task managers for the worst case scenario?
>
> Outsourcing to a separate batch framework for resource intensive tasks
> complicates the design of our pipeline and bifurcates our resource pool, so
> I'd rather use Flink for the whole process. I searched around and found two
> Jira tickets which could form the foundations of a solution to this problem:
>
> - https://issues.apache.org/jira/browse/FLINK-9953:  Active Kubernetes
> integration
> - https://issues.apache.org/jira/browse/FLINK-10240: Pluggable scheduling
> strategy for batch jobs
>
> Sadly, the latter seems to be stalled.
>
> I read the design doc
> <https://docs.google.com/document/d/1Zmhui_29VASPcBOEqyMWnF3L6WEWZ4kedrCqya0WaAk/edit>
>  [2]
> for the active K8s integration, and this statement seemed crucial:
>
> If integrated with k8s natively, Flink's resource manager will call
> kubernetes API to allocate/release new pods and adjust the resource usage
> on demand, and then support scale-up and scale down.
>
> This is particularly powerful when your k8s cluster is itself backed by
> auto-scaling of nodes (as with GKE autoscaler
> <https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler> 
> [3]),
> but it's unclear from the doc *when and how* resources are adjusted based
> on demand.  Will it simply scale up a shared pool of resource-identical
> task managers based on the size of the task backlog (or some other metric
> that determines "falling behind"), or does a task have a way of specifying
> and acquiring an execution resource that meets its specific performance
> profile?
>
> Based on the docs for the YARN resource manager [4], it acquires a pool of
> task managers with identical specs, so if this model is also used for the
> K8s resource manager, task managers would continue to be provisioned for
> the worst-case scenario (particularly in terms of RAM per process), which
> for us would mean they are drastically over-provisioned for common tasks.
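>
> For illustration (flags from the Flink 1.8 YARN setup docs [4]; the
> numbers are made up), a session is started with one fixed spec that
> every task manager shares:
>
>     ./bin/yarn-session.sh -n 4 -jm 1024 -tm 8192 -s 4
>
> so the 8 GB per task manager has to be sized for the most memory-hungry
> task, even though most tasks would need far less.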
>
> I'm new to Flink, so there's a good chance I've overlooked something
> important, so I'm looking forward to learning more!
>
> -thanks
> chad
>
>
> [1] https://www.awsthinkbox.com/deadline
> [2]
> https://docs.google.com/document/d/1Zmhui_29VASPcBOEqyMWnF3L6WEWZ4kedrCqya0WaAk/edit
> [3]
> https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler
> [4]
> https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/yarn_setup.html
>
