09.08.2019 23:13, John Snow wrote: > > > On 8/9/19 9:18 AM, Vladimir Sementsov-Ogievskiy wrote: >> Hi! >> >> Hmm, hacking around backup I have a question: >> >> What prevents guest write request after job_start but before setting >> write notifier? >> >> code path: >> >> qmp_drive_backup or transaction with backup >> >> job_start >> aio_co_enter(job_co_entry) /* may only schedule execution, isn't it >> ? */ >> >> .... >> >> job_co_entry >> job_pause_point() /* it definitely yields, isn't it bad? */ >> job->driver->run() /* backup_run */ >> >> ---- >> >> backup_run() >> bdrv_add_before_write_notifier() >> >> ... >> > > I think you're right... :( > > > We create jobs like this: > > job->paused = true; > job->pause_count = 1; > > > And then job_start does this: > > job->co = qemu_coroutine_create(job_co_entry, job); > job->pause_count--; > job->busy = true; > job->paused = false; > > > Which means that job_co_entry is being called before we lift the pause: > > assert(job && job->driver && job->driver->run); > job_pause_point(job); > job->ret = job->driver->run(job, &job->err); > > ...Which means that we are definitely yielding in job_pause_point. > > Yeah, that's a race condition waiting to happen. > >> And what guarantees we give to the user? Is it guaranteed that write >> notifier is >> set when qmp command returns? >> >> And I guess, if we start several backups in a transaction it should be >> guaranteed >> that the set of backups is consistent and correspond to one point in time... >> > > I would have hoped that maybe the drain_all coupled with the individual > jobs taking drain_start and drain_end would save us, but I guess we > simply don't have a guarantee that all backup jobs WILL have installed > their handler by the time the transaction ends. > > Or, if there is that guarantee, I don't know what provides it, so I > think we shouldn't count on it accidentally working anymore. > > > > I think we should do two things: > > 1. Move the handler installation to creation time. > 2. Modify backup_before_write_notify to return without invoking > backup_do_cow if the job isn't started yet. >
Hmm, I don't see, how it helps.. No-op write-notifier will not save as from guest write, is it? -- Best regards, Vladimir