Is the problem with the k8s scheduler? Are you using Karpenter as well?
When this happens, nodes are not scaling up at the same time you are
launching pods, right?
Pod startup time is a common problem. Maybe we could take something away
from how gang scheduling works? This is how spark so
I have replied in the issue with some thoughts on the root causes of ingestion
lag, and pointers to some recent work on one of the root causes. (I believe
there are two roots.)
Gian
On 2025/04/15 09:29:01 Frank Chen wrote:
> Hi Gian and Maytas,
>
> I'm writing this email to you to bring an old