On Thu, Jul 22, 2021 at 06:09:03PM +0100, Dr. David Alan Gilbert wrote: > * Peter Xu (pet...@redhat.com) wrote: > > On Thu, Jul 22, 2021 at 04:27:35PM +0100, Dr. David Alan Gilbert wrote: > > > > @@ -3352,6 +3355,8 @@ static MigThrError postcopy_pause(MigrationState > > > > *s) > > > > > > > > /* Current channel is possibly broken. Release it. */ > > > > assert(s->to_dst_file); > > > > + /* Unregister yank for current channel */ > > > > + migration_ioc_unregister_yank_from_file(s->to_dst_file); > > > > > > Should this go inside the lock? > > > > Shouldn't need to; as we've got the assert() right above so otherwise we'll > > abrt otherwise :) > > Hmm OK; although with out always having to think about it then you worry > about something getting in after the assert.
Right; the point is still that to_dst_file shouldn't be changed by other thread, or it'll bring some more challenge. If it can be mnodified when reaching this line, it means it can also reach earlier at assert(), then we coredump. So we should guaratee it won't happen, or we'd better also remove that assertion.. I think the challenge here is, we do have a mutex to protect the files and from that pov it's the same as other mutex. However it's different in that this mutex is also used in the oob handlers so we want to "keep it non-blocking and as small as possible for the critical sections". I didn't put it into the mutex protection because the yank code will further go to take the yank_lock so it'll make things slightly more complicated (then the file mutex is correlated to yank_lock too!). I don't think that's a problem because yank_lock is also "non-blocking" (ah! it can still block, got scheduled out, but there's no explicit things that will proactively sleep..). So since I think it's safe without the lock then I kept it outside of the lock then we don't need to discuss the safely of having it inside the critical section. (However then I noticed we still need justification on why it can be moved out..) > > > The mutex lock/unlock right below this one is not protecting us from someone > > changing it but really for being able to wait until someone finished using > > it > > then we won't crash someone else. > > > > I think the rational is to_dst_file is managed by migration thread while > > from_dst_file is managed by rp_thread. > > > > Maybe I add a comment above? > > OK, I almost pushed this further, but then I started to get worried that > we'd trace off a race with ordering on two locks with yank_lock; so yeh > lets just add a comment. Agreed. I think ordering is not a huge problem as yank_lock is very limitedly used in yank and protect yank instance/functions only, so there shouldn't be a path we take file mutex within it. Will repost shortly, thanks. -- Peter Xu