Am 02.05.2022 um 10:02 hat Emanuele Giuseppe Esposito geschrieben: > > > Am 30/04/2022 um 07:17 schrieb Stefan Hajnoczi: > > On Thu, Apr 28, 2022 at 11:56:09PM +0200, Emanuele Giuseppe Esposito wrote: > >> > >> > >> Am 28/04/2022 um 12:45 schrieb Stefan Hajnoczi: > >>> On Wed, Apr 27, 2022 at 08:55:35AM +0200, Emanuele Giuseppe Esposito > >>> wrote: > >>>> > >>>> > >>>> Am 26/04/2022 um 10:51 schrieb Emanuele Giuseppe Esposito: > >>>>> Luckly, most of the cases where we recursively go through a graph are > >>>>> the BlockDriverState callback functions in block_int-common.h > >>>>> In order to understand what to protect, I categorized the callbacks in > >>>>> block_int-common.h depending on the type of function that calls them: > >>>>> > >>>>> 1) If the caller is a generated_co_wrapper, this function must be > >>>>> protected by rdlock. The reason is that generated_co_wrapper create > >>>>> coroutines that run in the given bs AioContext, so it doesn't matter > >>>>> if we are running in the main loop or not, the coroutine might run > >>>>> in an iothread. > >>>>> 2) If the caller calls it directly, and has the GLOBAL_STATE_CODE() > >>>>> macro, > >>>>> then the function is safe. The main loop is the writer and thus won't > >>>>> read and write at the same time. > >>>>> 3) If the caller calls it directly, but has not the GLOBAL_STATE_CODE() > >>>>> macro, then we need to check the callers and see case-by-case if the > >>>>> caller is in the main loop, if it needs to take the lock, or delegate > >>>>> this duty to its caller (to reduce the places where to take it). > >>>>> > >>>>> I used the vrc script (https://github.com/bonzini/vrc) to get help > >>>>> finding > >>>>> all the callers of a callback. Using its filter function, I can > >>>>> omit all functions protected by the added lock to avoid having > >>>>> duplicates > >>>>> when querying for new callbacks. > >>>> > >>>> I was wondering, if a function is in category (3) and runs in an > >>>> Iothread but the function itself is not (currently) recursive, meaning > >>>> it doesn't really traverse the graph or calls someone that traverses it, > >>>> should I add the rdlock anyways or not? > >>>> > >>>> Example: bdrv_co_drain_end > >>>> > >>>> Pros: > >>>> + Covers if in future a new recursive callback for a new/existing > >>>> BlockDriver is implemented. > >>>> + Covers also the case where I or someone missed the recursive part. > >>>> > >>>> Cons: > >>>> - Potentially introducing an unnecessary critical section. > >>>> > >>>> What do you think? > >>> > >>> ->bdrv_co_drain_end() is a callback function. Do you mean whether its > >>> caller, bdrv_drain_invoke_entry(), should take the rdlock around > >>> ->bdrv_co_drain_end()? > >> > >> Yes. The problem is that the coroutine is created in bs AioContext, so > >> it might be in an iothread. > >> > >>> > >>> Going up further in the call chain (and maybe switching threads), > >>> bdrv_do_drained_end() has QLIST_FOREACH(child, &bs->children, next) so > >>> it needs protection. If the caller of bdrv_do_drained_end() holds then > >>> rdlock then I think none of the child functions (including > >>> ->bdrv_co_drain_end()) need to take it explicitly. > >> > >> Regarding bdrv_do_drained_end and similar, they are either running in > >> the main loop (or they will be, if coming from a coroutine) or in the > >> iothread running the AioContext of the bs involved. > >> > >> I think that most of the drains except for mirror.c are coming from main > >> loop. I protected mirror.c in patch 8, even though right now I am not > >> really sure that what I did is necessary, since the bh will be scheduled > >> in the main loop. > >> > >> Therefore we don't really need locks around drains. > > > > Are you saying rdlock isn't necessary in the main loop because nothing > > can take the wrlock while our code is executing in the main loop? > > Yes, that's the idea. > If I am not mistaken (and I hope I am not), only the main loop currently > modifies/is allowed to modify the graph.
Aren't you completely ignoring coroutines in this? What would a coroutine do that requires the graph not to change across a yield? (It's not easily possible to protect this today, and I think this was the source of some bugs in the past. But if we introduce some proper locking, I would expect it to solve this.) Kevin