Hi,

We have multiple jobs that need to be deployed to a Flink cluster. Parallelism 
varies from job to job depending on the type of work being done, and so do the 
memory requirements. All jobs currently use the same state backend. Since the 
workload handled by each job is different, the scaling pattern also varies. We 
run all our jobs in a single Flink cluster (7 VMs with the same instance 
configuration).

Most of what I have read in the Flink documentation suggests one of the 
following for setting the number of task slots:

1. As a rule of thumb, a good default number of task slots is the number of 
CPU cores. With hyper-threading, each slot then takes 2 or more hardware 
thread contexts. If the job performs any blocking I/O operations, it is 
suggested to have more slots than cores.

2. A Flink cluster needs exactly as many task slots as the highest parallelism 
used in the job. No need to calculate how many tasks (with varying parallelism) 
a program contains in total.
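
To make the second rule concrete for our multi-job case, here is a rough 
sketch of the arithmetic. It assumes, as far as I understand, that slots are 
not shared between jobs, so the rule applies per job and the totals add up. 
The job names and parallelism values are hypothetical, and one TaskManager per 
VM is also an assumption.

```python
import math

# Hypothetical jobs: name -> highest operator parallelism used in that job.
job_max_parallelism = {
    "high_traffic_job": 12,
    "medium_traffic_job": 6,
    "low_traffic_job": 2,
}

NUM_TASK_MANAGERS = 7  # assumption: one TaskManager on each of the 7 VMs


def slots_needed(max_parallelism_by_job):
    """Total slots if each job needs as many slots as its highest parallelism."""
    return sum(max_parallelism_by_job.values())


def slots_per_task_manager(total_slots, num_task_managers):
    """Smallest uniform slot count per TaskManager that covers total_slots."""
    return math.ceil(total_slots / num_task_managers)


total = slots_needed(job_max_parallelism)
per_tm = slots_per_task_manager(total, NUM_TASK_MANAGERS)
print(f"total slots: {total}, slots per TaskManager: {per_tm}")
```

Under these assumptions the per-TaskManager figure is what would go into 
taskmanager.numberOfTaskSlots. It says nothing about memory, which is the 
part that differs most between our jobs.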

I did not find documentation on the task slot setting for the scenario I have 
described. Setting a lower number of task slots seems to work better for jobs 
that process high volumes of traffic than for jobs that process lower 
volumes, but it will be inefficient when the slots are assigned to jobs that 
work on lower volumes of traffic.

Depending on the workload handled by each Flink job, it seems that we would 
need to set up a separate cluster for each type of workload.

1. Is this the only option available?
2. Are there any guidelines on deciding on the number of task slots in such an 
environment?

Thanks,
Sushruth
