On 7/13/23 21:29, Eric Blake wrote: > The documentation has claimed since commit 6f4dcdab that any > completion callback will be called exactly once; but this is not > consistent with the code: if nbd_aio_* itself returns an error, then > nothing is queued and the user does not need to wait for a completion > callback to know how the command failed. We could tweak the generator > to call completion.callback no matter what, but since the > completion.free callback already serves that role, it's easier to fix > the documentation to match reality. After all, one only needs > completion status if an aio command returned success (if it returned > failure, we know that there is nothing that is going to complete > later). > > However, there was one place where we indeed fail to call > completion.callback, even though the corresponding aio call returned > success, which can strand a user that was depending on the callback to > know that the pending aio command failed after all. That's when a > call to nbd_close() interrupts a connection while commands are in > flight. This problem appears to have been around even before commit > 52b9b492 (v0.9.8) when we finally settled on having .free callbacks in > the first place. > > Beef up the closure-lifetimes unit test to more robustly check the > various conditions guaranteed by the updated documentation, and to > expose the previous skip of a completion callback during nbd_close. > > In summary, the behavior we want (where sequence is important) is: > > - aio command fails: > mid-command .callback: 0 calls > completion .callback: 0 calls > mid-command .free: 1 call > completion .free: 1 call > > - aio command succeeds: > mid-command .callback: 0, 1, or multiple calls > completion .callback: 1 call > mid-command .free: 1 call > completion .free: 1 call > > Reported-by: Tage Johansson <tage.j.li...@posteo.net> > Fixes: 6f4dcdab ("docs: Clarify how callbacks should handle errors", v1.11.8) > Signed-off-by: Eric Blake <ebl...@redhat.com> > --- > docs/libnbd.pod | 26 ++++++++++++++++---------- > lib/handle.c | 3 +++ > tests/closure-lifetimes.c | 14 ++++++++++++++ > 3 files changed, 33 insertions(+), 10 deletions(-) > > diff --git a/docs/libnbd.pod b/docs/libnbd.pod > index 72f74053..433479e6 100644 > --- a/docs/libnbd.pod > +++ b/docs/libnbd.pod > @@ -904,9 +904,12 @@ same nbd object, as it would cause deadlock. > =head2 Completion callbacks > > All of the asychronous commands have an optional completion callback > -function that is used right before the command is marked complete, > -after any mid-command callbacks have finished, and before any free > -functions. > +function that is used if the asynchronous command succeeded, right > +before the command is marked complete, after any mid-command callbacks > +have finished, and before any free functions. The completion callback > +is not reached if the asynchronous command itself fails, while free > +functions are reached regardless of the initial result of the > +asynchronous command. > > When the completion callback returns C<1>, the command is > automatically retired (there is no need to call
I agree with this approach (i.e., with adapting the documentation to reality), but I find the language somewhat confusing. We have three terms: - "asynchronous command succeeds" - "command is marked complete" - "command is retired" The last two are mostly interchangeable in my view, and are also *not* confusing in the documentation. But the first term is confusing, IMO; it can easily be mistaken for meanings #2/#3. What we mean by #1 instead is the successful queueing (or submission) of the command. The text below does say "queued", but the text above doesn't. So I'd suggest replacing term#1 above with "call to asynchronous API succeeds" or "asynchronous command is successfully submitted" or something like that. (Side comment: I'd kinda prefer an all-or-nothing approach for async APIs. If the API fails at once, it should not take ownership of anything; i.e., it shouldn't call either completion or free callbacks. And if it succeeds, then it should take complete ownership. I'm not suggesting to rework callbacks whole-sale for this, though.) With the language clarified a bit: Acked-by: Laszlo Ersek <ler...@redhat.com> Thanks Laszlo > @@ -946,13 +949,16 @@ mid-command callback may be reached more times than > expected if the > server is non-compliant. > > On the other hand, if a completion callback is supplied (only possible > -with asynchronous commands), it will always be reached exactly once, > -and the completion callback must not ignore the value pointed to by > -C<error>. In particular, the content of a buffer passed to > -L<nbd_aio_pread(3)> or L<nbd_aio_pread_structured(3)> is undefined > -if C<*error> is non-zero on entry to the completion callback. It is > -recommended that if you choose to use automatic command retirement > -(where the completion callback returns C<1> to avoid needing to call > +with asynchronous commands), it will not be called if the initial API > +call fails (such as attempting an asynchronous command in the wrong > +state - there is nothing to be completed since the command was not > +queued), but will otherwise be called exactly once, and the completion > +callback must not ignore the value pointed to by C<error>. In > +particular, the content of a buffer passed to L<nbd_aio_pread(3)> or > +L<nbd_aio_pread_structured(3)> is undefined if C<*error> is non-zero > +on entry to the completion callback. It is recommended that if you > +choose to use automatic command retirement (where the completion > +callback returns C<1> to avoid needing to call > L<nbd_aio_command_completed(3)> later), your completion function > should return C<1> on all control paths, even when handling errors > (note that with automatic retirement, assigning into C<error> is > diff --git a/lib/handle.c b/lib/handle.c > index 0f11bee5..1d66d585 100644 > --- a/lib/handle.c > +++ b/lib/handle.c > @@ -37,7 +37,10 @@ free_cmd_list (struct command *list) > struct command *cmd, *cmd_next; > > for (cmd = list; cmd != NULL; cmd = cmd_next) { > + int error = cmd->error ? : ENOTCONN; > + > cmd_next = cmd->next; > + CALL_CALLBACK (cmd->cb.completion, &error); > nbd_internal_retire_and_free_command (cmd); > } > } > diff --git a/tests/closure-lifetimes.c b/tests/closure-lifetimes.c > index 9bb4e120..23691631 100644 > --- a/tests/closure-lifetimes.c > +++ b/tests/closure-lifetimes.c > @@ -66,6 +66,7 @@ read_cb (void *opaque, > uint64_t offset, unsigned status, int *error) > { > assert (!read_cb_freed); > + assert (!completion_cb_called); > read_cb_called++; > return 0; > } > @@ -73,6 +74,7 @@ read_cb (void *opaque, > static void > read_cb_free (void *opaque) > { > + assert (!completion_cb_freed); > read_cb_freed++; > } > > @@ -81,6 +83,7 @@ block_status_cb (void *opaque, const char *meta, uint64_t > offset, > uint32_t *entries, size_t nr_entries, int *error) > { > assert (!block_status_cb_freed); > + assert (!completion_cb_called); > block_status_cb_called++; > return 0; > } > @@ -88,6 +91,7 @@ block_status_cb (void *opaque, const char *meta, uint64_t > offset, > static void > block_status_cb_free (void *opaque) > { > + assert (!completion_cb_freed); > block_status_cb_freed++; > } > > @@ -150,6 +154,10 @@ main (int argc, char *argv[]) > cookie = nbd_aio_pread_structured (nbd, buf, sizeof buf, 0, chunk_callback, > completion_callback, 0); > if (cookie == -1) NBD_ERROR; > + /* read_cb_called is indeterminate at this point, as state machine > + * progress may vary based on task schduling and network speed factors. > + */ > + assert (completion_cb_called == 0); > assert (read_cb_freed == 0); > assert (completion_cb_freed == 0); > while (!nbd_aio_command_completed (nbd, cookie)) { > @@ -179,6 +187,8 @@ main (int argc, char *argv[]) > nbd_kill_subprocess (nbd, 0); > nbd_close (nbd); > > + /* read_cb_called is indeterminate based on timing of kill. */ > + assert (completion_cb_called == 1); > assert (read_cb_freed == 1); > assert (completion_cb_freed == 1); > > @@ -198,6 +208,8 @@ main (int argc, char *argv[]) > fprintf (stderr, "%s: Expecting block_status failure\n", argv[0]); > exit (EXIT_FAILURE); > } > + assert (block_status_cb_called == 0); > + assert (completion_cb_called == 0); > assert (block_status_cb_freed == 1); > assert (completion_cb_freed == 1); > > @@ -212,6 +224,8 @@ main (int argc, char *argv[]) > fprintf (stderr, "%s: Expecting block_status failure\n", argv[0]); > exit (EXIT_FAILURE); > } > + assert (block_status_cb_called == 0); > + assert (completion_cb_called == 0); > assert (block_status_cb_freed == 1); > assert (completion_cb_freed == 1); > _______________________________________________ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs