On 26/01/2022 17:10, Hanna Reitz wrote: > On 21.01.22 18:05, Emanuele Giuseppe Esposito wrote: >> block_crypto_amend_options_generic_luks uses the block layer >> permission API, therefore it should be called with the BQL held. >> >> However, the same function is being called by two BlockDriver >> callbacks: bdrv_amend_options (under BQL) and bdrv_co_amend (I/O). >> >> The latter is I/O because it is invoked by block/amend.c's >> blockdev_amend_run(), a .run callback of the amend JobDriver >> >> Therefore we want to 1) change block_crypto driver >> to use the permission API only when the BQL is held, and >> 2) use the .pre_run JobDriver callback to check for >> permissions before switching to the job aiocontext. This has also >> the benefit of applying the same permission operation to all >> amend implementations, not only luks. >> >> Remove the permission check in block_crypto_amend_options_generic_luks() >> and: >> - Add helper functions block_crypto_amend_options_{prepare/cleanup} >> that take care of checking permissions in >> block_crypto_amend_options_luks(), so when it is under BQL, and >> >> - Use job->pre_run() and job->clean() to do the same thing when >> we are in an iothread, by performing these checks before the >> job runs in its aiocontext. So far job->pre_run() is only defined >> but not called in job_start(), now it is the moment to use it. >> >> Signed-off-by: Emanuele Giuseppe Esposito <eespo...@redhat.com> >> --- >> block/crypto.c | 57 ++++++++++++++++++++++++++++++++------------------ >> job.c | 13 ++++++++++++ >> 2 files changed, 50 insertions(+), 20 deletions(-) >> >> diff --git a/block/crypto.c b/block/crypto.c >> index f5e0c7b7c0..bdb4ba5664 100644 >> --- a/block/crypto.c >> +++ b/block/crypto.c >> @@ -791,6 +791,28 @@ block_crypto_amend_cleanup(BlockDriverState *bs) >> crypto->updating_keys = false; >> } >> +static int >> +block_crypto_amend_options_prepare(BlockDriverState *bs, >> + Error **errp) >> +{ >> + BlockCrypto *crypto = bs->opaque; >> + >> + /* apply for exclusive read/write permissions to the underlying >> file*/ >> + crypto->updating_keys = true; >> + return bdrv_child_refresh_perms(bs, bs->file, errp); >> +} >> + >> +static int >> +block_crypto_amend_options_cleanup(BlockDriverState *bs, >> + Error **errp) >> +{ >> + BlockCrypto *crypto = bs->opaque; >> + >> + /* release exclusive read/write permissions to the underlying file*/ >> + crypto->updating_keys = false; >> + return bdrv_child_refresh_perms(bs, bs->file, errp); >> +} >> + > > Now that I have this patch applied, it does look like it would be nicer > if we could skip adding these functions and just reuse > block_crypto_amend_{pre_run,cleanup}() (which would require them to call > bdrv_child_refresh_perms()). > >> static int >> block_crypto_amend_options_generic_luks(BlockDriverState *bs, >> QCryptoBlockAmendOptions >> *amend_options, >> @@ -798,30 +820,17 @@ >> block_crypto_amend_options_generic_luks(BlockDriverState *bs, >> Error **errp) >> { >> BlockCrypto *crypto = bs->opaque; >> - int ret; >> assert(crypto); >> assert(crypto->block); >> - /* apply for exclusive read/write permissions to the underlying >> file*/ >> - crypto->updating_keys = true; >> - ret = bdrv_child_refresh_perms(bs, bs->file, errp); >> - if (ret) { >> - goto cleanup; >> - } >> - >> - ret = qcrypto_block_amend_options(crypto->block, >> - block_crypto_read_func, >> - block_crypto_write_func, >> - bs, >> - amend_options, >> - force, >> - errp); >> -cleanup: >> - /* release exclusive read/write permissions to the underlying file*/ >> - crypto->updating_keys = false; >> - bdrv_child_refresh_perms(bs, bs->file, errp); >> - return ret; >> + return qcrypto_block_amend_options(crypto->block, >> + block_crypto_read_func, >> + block_crypto_write_func, >> + bs, >> + amend_options, >> + force, >> + errp); >> } >> static int >> @@ -847,8 +856,16 @@ block_crypto_amend_options_luks(BlockDriverState >> *bs, >> if (!amend_options) { >> goto cleanup; >> } >> + >> + ret = block_crypto_amend_options_prepare(bs, errp); >> + if (ret) { >> + goto perm_cleanup; >> + } >> ret = block_crypto_amend_options_generic_luks(bs, amend_options, >> force, errp); >> + >> +perm_cleanup: >> + block_crypto_amend_options_cleanup(bs, errp); > > Uh, pre-existing but still dangerous. We must not pass @errp here, > because it may (and if we come from ..._prepare() failing, s/may/will/) > already contain some error, and then, if this fails (which it very > likely will not), we will get an assertion failure in error_setv(). > > We could decide that this must not fail and pass &error_abort (but then > block_crypto_amend_options_cleanup() should do that), or we pass some > new guaranteed-empty pointer and report it. > > In any case, we should probably have > block_crypto_amend_options_cleanup() (or block_crypto_amend_cleanup()) > handle this and have that function return void and no error, so we don’t > have to worry about that here at all. Applied all feedback on crypto and amend (patches 30, 31, 32). All what you said makes sense. > >> cleanup: >> qapi_free_QCryptoBlockAmendOptions(amend_options); >> return ret; >> diff --git a/job.c b/job.c >> index 39bf511949..cf0dc9325a 100644 >> --- a/job.c >> +++ b/job.c >> @@ -967,11 +967,24 @@ static void coroutine_fn job_co_entry(void *opaque) >> aio_bh_schedule_oneshot(qemu_get_aio_context(), job_exit, job); >> } >> +static int job_pre_run(Job *job) >> +{ >> + assert(qemu_in_main_thread()); >> + if (job->driver->pre_run) { >> + return job->driver->pre_run(job, &job->err); >> + } >> + >> + return 0; >> +} >> + >> void job_start(Job *job) >> { >> assert(job && !job_started(job) && job->paused && >> job->driver && job->driver->run); >> job->co = qemu_coroutine_create(job_co_entry, job); >> + if (job_pre_run(job)) { >> + return; > > Do we not need to announce the error somehow? Like, perhaps > job_pre_run() should set job->ret to the value returned by .pre_run() > (like job_co_entry() does it for .run()), and then call job_completed() > on error (or even job_exit()? I’m not sure :/). > > The way it is, it looks like the job will just basically leak on error, > and never complete. > I will do something like this: 1. move job_pre_run to run just before aio_co_enter() in job_start(), because it must be in JOB_STATUS_RUNNING otherwise the transition status in job_exit functions won't probably work. Basically simulate as .run() just finished and failed. 2. change in this + ret = job->driver->pre_run(job, &job->err); + if (ret) { + job->ret = ret; + job_exit(job); + return ret; + } basically cache the reply of pre_run, and if it's non-zero, set job->ret and exit, just as job_co_entry does. Thank you, Emanuele > Hanna > >> + } >> job->pause_count--; >> job->busy = true; >> job->paused = false; >