On Mon, May 16, 2022 at 08:21:23PM +0530, manish.mishra wrote:
> 
> On 16/05/22 7:41 pm, Peter Xu wrote:
> > Hi, Manish,
> > 
> > On Mon, May 16, 2022 at 07:01:35PM +0530, manish.mishra wrote:
> > > On 26/04/22 5:08 am, Peter Xu wrote:
> > > LGTM,
> > > Peter, I wanted to give a review-tag for this and the earlier patch
> > > too. I am new to the qemu review process so I am not sure how to give
> > > a review-tag, and did not find any reference on google either. So
> > > please let me know how to do it.
> > 
> > It's here:
> > 
> > https://git.qemu.org/?p=qemu.git;a=blob;f=docs/devel/submitting-a-patch.rst;hb=HEAD#l492
> > 
> > Since afaict QEMU is mostly following what Linux does, you can also
> > reference this one, which has more context:
> > 
> > https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#using-reported-by-tested-by-reviewed-by-suggested-by-and-fixes
> > 
> > But since you still have questions regarding this patch, no rush on
> > providing your R-bs; let's finish the discussion first.
> > 
> > [...]
> > 
> > > > +static void postcopy_pause_ram_fast_load(MigrationIncomingState *mis)
> > > > +{
> > > > +    trace_postcopy_pause_fast_load();
> > > > +    qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex);
> > > 
> > > I may have misunderstood the synchronisation here, and the chance is
> > > very, very rare, but as both threads work independently, is it
> > > possible that qemu_sem_post on postcopy_pause_sem_fast_load is called
> > > by the main thread even before we reach qemu_sem_wait on the next
> > > line, causing some kind of deadlock? That should not be possible, as
> > > I understand it requires manually calling qmp_migration_recover, so
> > > the chances are almost nil. Just wanted to confirm it.
> > 
> > Sorry I don't quite get the question, could you elaborate?  E.g., when
> > the described deadlock happened, what are both the main thread and the
> > preempt load thread doing?  What are they waiting at?
> > 
> > Note: we have already released the mutex before waiting on the sem.
> 
> What I meant here is that a deadlock could happen because we wait forever
> in qemu_sem_wait(&mis->postcopy_pause_sem_fast_load): the main thread
> already called post on postcopy_pause_sem_fast_load after recovery, even
> before we moved to qemu_sem_wait(&mis->postcopy_pause_sem_fast_load) on
> the next line. Basically, if we miss a post on
> postcopy_pause_sem_fast_load. This is a nearly impossible case, because
> it requires the full recovery path to complete before this thread
> executes just the next line. Also, as recovery needs to be triggered
> manually, please ignore this.
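To make the pattern under discussion concrete, here is a standalone
sketch (raw POSIX primitives stand in for QEMU's qemu_mutex_*/qemu_sem_*
wrappers, and the names are illustrative stand-ins, not the actual QEMU
code):

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    /* Stand-ins for mis->postcopy_prio_thread_mutex and
     * mis->postcopy_pause_sem_fast_load. */
    static pthread_mutex_t prio_thread_mutex = PTHREAD_MUTEX_INITIALIZER;
    static sem_t pause_sem_fast_load;

    /* Preempt load thread: the pause path. */
    static void *fast_load_thread(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&prio_thread_mutex);
        /* ... channel failure detected, enter the paused state ... */
        pthread_mutex_unlock(&prio_thread_mutex);
        /*
         * The window in question: the recovery post() may land right
         * here, before we reach sem_wait() below.
         */
        sem_wait(&pause_sem_fast_load);
        printf("fast load thread resumed\n");
        return NULL;
    }

    int main(void)
    {
        pthread_t thread;

        sem_init(&pause_sem_fast_load, 0, 0);
        pthread_create(&thread, NULL, fast_load_thread, NULL);
        /* Recovery path: this post may well run before the other
         * thread starts waiting. */
        sem_post(&pause_sem_fast_load);
        pthread_join(thread, NULL);
        sem_destroy(&pause_sem_fast_load);
        return 0;
    }

Built with "cc -pthread", the load thread resumes in both orderings, for
the reason explained below.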
Yes, the migration state has a dependency.  The other, more obvious
reason it won't happen is that a semaphore is counter based, so it
remembers posts.  Please try the snippet below:

    /* created with an initial value of 0 */
    sem_t *sem = sem_open("/test", O_CREAT, 0600, 0);

    sem_post(sem);
    sem_wait(sem);

And sem_wait() will return immediately, because post() has already set
the counter to 1.

> Basically I wanted to check if we should use something like
> 
>     int pthread_cond_wait(pthread_cond_t *restrict cond,
>                           pthread_mutex_t *restrict mutex);
> 
> so that there is no race between releasing the mutex and calling wait.

Yes, I think a condvar should also work here, but it would need to be
stricter.

Thanks,

-- 
Peter Xu
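For comparison, a minimal sketch of the condvar pattern suggested above,
assuming plain pthreads rather than QEMU's wrappers (the names and
helpers here are hypothetical). Unlike a semaphore, a condvar signal is
not remembered, so the wait has to be paired with a predicate checked
under the mutex:

    #include <pthread.h>
    #include <stdbool.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t resume_cond = PTHREAD_COND_INITIALIZER;
    static bool resumed; /* predicate: has recovery completed? */

    /*
     * Pause path: pthread_cond_wait() releases the mutex and blocks
     * atomically, so there is no unlock-then-wait window.  The
     * predicate loop also covers a signal delivered before the waiter
     * blocks, as well as spurious wakeups.
     */
    static void wait_for_resume(void)
    {
        pthread_mutex_lock(&lock);
        while (!resumed) {
            pthread_cond_wait(&resume_cond, &lock);
        }
        resumed = false; /* consume the event */
        pthread_mutex_unlock(&lock);
    }

    /* Recovery path: set the predicate under the mutex, then signal. */
    static void signal_resume(void)
    {
        pthread_mutex_lock(&lock);
        resumed = true;
        pthread_cond_signal(&resume_cond);
        pthread_mutex_unlock(&lock);
    }

Without the resumed flag, a signal sent before the waiter blocks would
be lost; that extra bookkeeping is presumably the sense in which the
condvar has to be "stricter" than the counting semaphore.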