kfaraz commented on PR #19549:
URL: https://github.com/apache/druid/pull/19549#issuecomment-4775762357

   @Fly-Style, I agree with @jtuglu1.
   Let's keep this PR on hold.
   
   There is not a strong case to be made for intermediate task counts because,
with say 500 partitions, going from 250 tasks to 300 tasks would still leave
200 of the 300 tasks (assuming an even assignment) processing 2 partitions
each, thus not helping with the lag.
   Allowing intermediate task counts in the `costBased` auto-scaler would take
us back in the direction of the old `lagBased` auto-scaler, which doesn't seem
desirable.
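   
   To make the arithmetic in the 500-partition example concrete, here is a
minimal sketch. It assumes partitions are spread as evenly as possible across
tasks and that every partition carries the same volume; the function name
`partitions_per_task` is hypothetical and not part of Druid.

```python
from collections import Counter


def partitions_per_task(num_partitions: int, task_count: int) -> Counter:
    """Map 'partitions held by a task' -> 'number of tasks holding that many'.

    Assumes an even spread: every task gets the base share, and the
    remainder is handed out one extra partition per task.
    """
    base, extra = divmod(num_partitions, task_count)
    spread = Counter()
    if extra:
        spread[base + 1] = extra
    if task_count - extra:
        spread[base] = task_count - extra
    return spread


for tasks in (250, 300, 500):
    spread = partitions_per_task(500, tasks)
    # The busiest task bounds how quickly lag can drain.
    print(tasks, dict(spread), "max partitions on one task:", max(spread))
```

   Under these assumptions the busiest task holds 2 partitions at both 250 and
300 tasks, and only drops to 1 partition at 500 tasks.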
   
   In terms of cost, while 300 tasks would cost less than 500, there is no 
point in trying 300 if we know that they are not going to keep up. In fact, it 
would only delay the scale up to 500, which may also be a cost/SLA concern for 
the operator.
   
   The only case where 300 tasks would help is if there is skew in the volume 
of data for each partition. But the correct solution for that would be to 
identify the slow tasks/partitions and try to focus the scale up only on those.
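   
   As a rough illustration of what identifying the slow partitions could look
like, the sketch below flags partitions whose lag is well above the median.
The function name `find_hot_partitions`, the `factor` threshold, and the
sample lag values are all hypothetical; this is not how Druid or this PR does
it.

```python
from statistics import median


def find_hot_partitions(lag_by_partition: dict[int, int], factor: float = 3.0) -> list[int]:
    """Return ids of partitions whose lag exceeds `factor` times the median lag."""
    if not lag_by_partition:
        return []
    threshold = factor * median(lag_by_partition.values())
    return sorted(p for p, lag in lag_by_partition.items() if lag > threshold)


# Made-up lag values: partition 3 is far behind the others.
lags = {0: 100, 1: 120, 2: 90, 3: 5000, 4: 110}
print(find_hot_partitions(lags))
```

   A scale-up could then target only the tasks that own the flagged
partitions instead of raising the task count across the board.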


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

