On Thu, Mar 07, 2024 at 03:30:30PM +0000, Daniel P. Berrangé wrote:
> I wonder if something is hitting the 'max_client_requests' limit and
> getting stalled.
>
> The initial thread message here says the lockup is happening during
> bulk concurrent live migrations of 200 VMs, 5 at a time.
>
> The default 'max_client_requests' is 5.... DANGER WILL ROBINSON...
>
> With live migration making requests across multiple libvirt daemons,
> if the target host has filled its 5 requests queue with long running
> operations, and then a 'prepare migrate' call comes in, that'll get
> stalled behind a possibly slow operation at the RPC dispatch level.
>
> I'd suggest bumping 'max_client_requests' to 100 and seeing if the
> problem goes away.
>
> If so I wonder if we shouldn't raise our out of the box limits.
> '5' is pretty low considering the scale of virtualization hosts
> in the modern world, and where even my laptop has 20 CPUs and
> 64 GB of RAM.

FWIW I was running a simple workload inside KubeVirt (a test case
that's part of its functional test suite and involves spawning and
subsequently migrating a single VM) yesterday and I could see
warnings about hitting max_client_requests in the logs.
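
For anyone wanting to try the suggestion quoted above, a minimal sketch
of the change follows. The file paths are assumptions that depend on the
distribution and on whether the host runs the monolithic libvirtd or the
modular daemons; the value 100 is simply the one proposed in the quoted
message, not a tuned recommendation.

```
# Assumed location: /etc/libvirt/libvirtd.conf (monolithic daemon) or
# /etc/libvirt/virtqemud.conf (modular daemons). Verify on your host.

# Limit on concurrent requests from a single client connection.
# The out-of-the-box default discussed in this thread is 5.
max_client_requests = 100
```

The daemon most likely needs a restart before the new value takes
effect, and the change would have to be made on both the source and
the target hosts of the migration to rule the limit out.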

-- 
Andrea Bolognani / Red Hat / Virtualization