On Thu, Oct 03, 2024 at 10:34:28PM +0200, Maciej S. Szmigiero wrote:
> To be clear, these loading threads are mostly blocking I/O threads, NOT
> compute threads.
> This means that the usual "rule of thumb" that the count of threads should
> not exceed the total number of logical CPUs does NOT apply to them.
> 
> They are similar to the threads glibc uses under the hood to simulate POSIX
> AIO (aio_read(), aio_write()) and to implement its async DNS resolver
> (getaddrinfo_a()), and to what GLib's GIO uses to simulate its own async
> file operations.
> Using helper threads for turning blocking I/O into "AIO" is a pretty common
> thing.

Fair enough.  Yes, I could be over-cautious due to my previous experience
with managing all kinds of migration threads.
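
For reference, that pattern (a worker thread that just sits in a blocking
syscall on behalf of the caller and signals completion afterwards) looks
roughly like the minimal sketch below; the names are made up and this is
neither the glibc nor the QEMU implementation:

  /* Hypothetical sketch of the "blocking I/O in a helper thread" pattern;
   * the helper spends almost all of its time blocked in pread(), so it
   * uses next to no host CPU. */
  #include <pthread.h>
  #include <stdbool.h>
  #include <sys/types.h>
  #include <unistd.h>

  typedef struct IoRequest {
      int fd;
      void *buf;
      size_t len;
      off_t offset;
      ssize_t result;
      bool done;
      pthread_mutex_t lock;
      pthread_cond_t cond;
  } IoRequest;

  static void *io_helper_thread(void *opaque)
  {
      IoRequest *req = opaque;

      /* The actual blocking I/O happens here, off the caller's thread. */
      req->result = pread(req->fd, req->buf, req->len, req->offset);

      pthread_mutex_lock(&req->lock);
      req->done = true;                 /* completion notification */
      pthread_cond_signal(&req->cond);
      pthread_mutex_unlock(&req->lock);
      return NULL;
  }

The caller fires this off with pthread_create() and later either checks
req->done or waits on req->cond, which is essentially the division of
labour behind the emulated aio_read()/aio_error() pair.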

> 
> To show that these loading threads mostly spend their time sleeping (waiting
> for I/O), I made a quick patch at [1] that traces how much time they spend
> waiting for incoming buffers and how much time they spend waiting for these
> buffers to be loaded into the device.
> 
> The results (without patch [2] described later) are like this:
> > 5919@1727974993.403280:vfio_load_state_device_buffer_start  (0000:af:00.2)
> > 5921@1727974993.407932:vfio_load_state_device_buffer_start  (0000:af:00.4)
> > 5922@1727974993.407964:vfio_load_state_device_buffer_start  (0000:af:00.5)
> > 5920@1727974993.408480:vfio_load_state_device_buffer_start  (0000:af:00.3)
> > 5920@1727974993.666843:vfio_load_state_device_buffer_end  (0000:af:00.3) wait 43 ms load 217 ms
> > 5921@1727974993.686005:vfio_load_state_device_buffer_end  (0000:af:00.4) wait 75 ms load 206 ms
> > 5919@1727974993.686054:vfio_load_state_device_buffer_end  (0000:af:00.2) wait 69 ms load 210 ms
> > 5922@1727974993.689919:vfio_load_state_device_buffer_end  (0000:af:00.5) wait 79 ms load 204 ms
> 
> Summing up:
> 0000:af:00.2 total loading time 283 ms, wait 69 ms load 210 ms
> 0000:af:00.3 total loading time 258 ms, wait 43 ms load 217 ms
> 0000:af:00.4 total loading time 278 ms, wait 75 ms load 206 ms
> 0000:af:00.5 total loading time 282 ms, wait 79 ms load 204 ms
> 
> In other words, these threads spend ~100% of their total runtime waiting
> for I/O, with 70%-75% of that time spent waiting for buffers to get loaded
> into their target device.
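
As a quick sanity check of those percentages, taking 0000:af:00.2 from the
summary above:

  wait 69 ms + load 210 ms = 279 ms of the 283 ms total  ->  ~99% I/O wait
  load 210 ms / 283 ms total                             ->  ~74% device load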
> 
> So having more threads here won't negatively affect the host CPU
> consumption since these threads barely use the host CPU at all.
> Also, their count is capped at the number of VFIO devices in the VM.
> 
> I also did a quick test with the same config as usual: 4 VFs, 6 multifd
> channels, but with patch at [2] simulating forced coupling of loading
> threads to multifd receive channel threads.
> 
> With this patch, the load_state_buffer() handler returns to the multifd
> channel thread only once the loading thread has finished loading the
> available buffers and is about to wait for the next buffers to arrive -
> just as loading the buffers directly from these channel threads would
> behave.
> 
> The resulting lowest downtime from 115 live migration runs was 1295ms -
> that's 21% worse than 1068ms of downtime with these loading threads running
> on their own.
> 
> I expect this performance penalty to get even worse with more than 4 VFs.
> 
> So no, we can't load buffers directly from multifd channel receive threads.

6 channels may be a bit few in this test case with 4 VFs, but indeed adding
such a dependency on the number of multifd threads isn't as good either, I
agree.  I'm OK with this as long as the VFIO reviewers are fine with it.
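
The coupling that patch [2] simulates (the channel thread not returning from
the handler until the loading thread has drained everything queued so far
and is about to sleep again) could be pictured roughly as below; the names
are invented and this is not the actual patch:

  #include <glib.h>
  #include <pthread.h>
  #include <stdbool.h>

  /* Hypothetical sketch, not patch [2]: one queue per device, shared by a
   * multifd receive channel thread and the device's loading thread. */
  typedef struct LoadQueue {
      pthread_mutex_t lock;
      pthread_cond_t buffers_available; /* signalled by the channel thread */
      pthread_cond_t loader_idle_cond;  /* signalled by the loading thread */
      GQueue buffers;                   /* initialised with g_queue_init() */
      bool loader_idle;
  } LoadQueue;

  /* Stand-in for the blocking load of one buffer into the device. */
  static void load_buffer_into_device(void *buf)
  {
      (void)buf;
  }

  /* Channel thread side: enqueue a buffer, then block until the loading
   * thread has consumed everything queued so far, i.e. until it is about
   * to wait for more data again. */
  static void channel_enqueue_and_wait(LoadQueue *q, void *buf)
  {
      pthread_mutex_lock(&q->lock);
      g_queue_push_tail(&q->buffers, buf);
      q->loader_idle = false;
      pthread_cond_signal(&q->buffers_available);
      while (!q->loader_idle) {
          pthread_cond_wait(&q->loader_idle_cond, &q->lock);
      }
      pthread_mutex_unlock(&q->lock);
  }

  /* Loading thread side: load buffers as they arrive; once the queue is
   * empty, declare itself idle and wake the channel thread. */
  static void *loading_thread(void *opaque)
  {
      LoadQueue *q = opaque;

      pthread_mutex_lock(&q->lock);
      for (;;) {
          while (g_queue_is_empty(&q->buffers)) {
              q->loader_idle = true;
              pthread_cond_signal(&q->loader_idle_cond);
              pthread_cond_wait(&q->buffers_available, &q->lock);
          }
          void *buf = g_queue_pop_head(&q->buffers);
          pthread_mutex_unlock(&q->lock);
          load_buffer_into_device(buf);
          pthread_mutex_lock(&q->lock);
      }
      return NULL;
  }

Without the while (!q->loader_idle) wait the channel thread would go straight
back to reading from its socket, which is the decoupled behaviour the numbers
above favour.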

> 
> > PS: I'd suggest that if you really need those threads, they should still be
> > managed by the migration framework, like the src thread pool.  Sorry I'm
> > pretty stubborn on this, especially after noticing that we got the
> > query-migrationthreads API just recently.. even if now I'm not sure whether
> > we should remove that API.  I assume that shouldn't need much change, even
> > if it is necessary.
> 
> I can certainly make these loading threads managed in a thread pool if that's
> easier for you.

Yes, if you want to use separate threads it'll be great to match the src
thread model with a similar pool.  I hope the pool interface you have is
easily applicable to both sides.
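
Not knowing what the pool interface in the series will end up looking like,
the destination side could be pictured along the lines of the sketch below
(GLib's GThreadPool is used purely as a stand-in for the migration
framework's pool; all names are invented):

  #include <glib.h>

  /* Hypothetical sketch: one shared pool of loading workers, one work item
   * per VFIO device, instead of ad-hoc per-device threads. */
  typedef struct VFIOLoadTask {
      const char *devname;    /* e.g. "0000:af:00.2" */
      /* ... per-device load state would live here ... */
  } VFIOLoadTask;

  static void vfio_load_worker(gpointer data, gpointer user_data)
  {
      VFIOLoadTask *task = data;

      /* Here the worker would run the same blocking wait-for-buffers /
       * load-into-device loop the current per-device thread runs;
       * omitted in this sketch. */
      (void)task;
      (void)user_data;
  }

  static GThreadPool *vfio_load_pool_create(int num_devices, GError **errp)
  {
      /* The workers are I/O-bound, so cap the pool at the number of VFIO
       * devices rather than at the number of host CPUs. */
      return g_thread_pool_new(vfio_load_worker, NULL, num_devices,
                               FALSE, errp);
  }

Each device's task would then be handed over with g_thread_pool_push() during
load setup, giving the migration code a single place to track these threads.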

Thanks,

-- 
Peter Xu

