On 06/11/2017 17:30, Alex Bennée wrote: > Previously the synchronisation of the main thread and the vCPU thread > was ensured by the holding of the BQL. However the trend has been to > reduce the time the BQL was held across the system including under TCG > system emulation. As it is important that batches of events are kept > in sequence (e.g. expiring timers and checkpoints in the main thread > while instruction checkpoints are written by the vCPU thread) we need > another lock to keep things in lock-step. This role is now handled by > the replay_mutex_lock. It used to be held only for each event being > written but now it is held for a whole execution period. This results > in a deterministic ping-pong between the two main threads.
I would remove the last two sentences (which might belong in a commit message, but not in documentation). > As the BQL is now a finer grained lock than the replay_lock it is > almost certainly a bug taking the replay_mutex_lock while the BQL is > held. This is enforced by an assert. While the unlocks are usually in > the reverse order it is not necessary and therefor you can drop the > replay_lock while holding the BQL rather than doing any more > unlock/unlock/lock sequences. As the BQL is now a finer grained lock than the replay_lock it is almost certainly a bug, and a source of deadlocks, to take the replay_mutex_lock while the BQL is held. This is enforced by an assert. While the unlocks are usually in the reverse order, this is not necessary; you can drop the replay_lock while holding the BQL, without doing a more complicated unlock_iothread/replay_unlock/lock_iothread sequence. Paolo