On Thu, Aug 31, 2023 at 11:12:59AM +0200, Laszlo Ersek wrote:
> On 8/31/23 10:02, Richard W.M. Jones wrote:
> >
> > On Wed, Aug 30, 2023 at 05:21:19PM -0500, Eric Blake wrote:
> >> I hit another transient failure in libnbd CI when a poorly-written
> >> eval script did not consume all of stdin during .pwrite. As behaving
> >> as a data sink can be a somewhat reasonable feature of a
> >> quickly-written sh or eval plugin, we should not be so insistent as
> >> treating an EPIPE failure as an immediate return of EIO to the client.
> >
> > I was thinking about this over night, and came to the conclusion that
> > it's always fine to ignore EPIPE errors.
>
> Interesting; I formed the opposite impression!
>
> > For example a script might
> > be processing the input data gradually and then encounter an error and
> > want to exit immediately. We also have plenty of plugins that discard
> > some or all of the written data.
>
> But that would be associated with a nonzero exit status, right?
In the error case, yes it would exit with a non-zero exit status.
In the "don't care about data" case it would exit with 0.
As a concrete example the code currently has to look like this:
case "$1")
can_write) echo 0 ;;
pwrite)
...
if [ there is an error ]; then
cat >/dev/null # discard stdin
echo 'EIO I/O error' >&2
exit 1
else
cat >/dev/null # discard stdin
exit 0
fi
where we're saying that the 'cat >/dev/null' commands are unnecessary
complication.
There have been cases where we have forgotten to discard stdin on
every exit path and this has caused intermittent test failures in CI:
https://gitlab.com/nbdkit/libnbd/-/commit/4df70870420d1be9348ac45f4aa467501eca5089
https://gitlab.com/nbdkit/libnbd/-/commit/c713529e9fd0641b2d73f764517b5f9c21a767fd
Rich.
> And that way the nbd client would see the pwrite operation as failed.
>
> Laszlo
>
> >
> > So my counter-proposal (coming soon) is going to simply turn the EPIPE
> > error into a debug message and discard the rest of the write buffer.
> >
> > Rich.
> >
> >
> >> Signed-off-by: Eric Blake <ebl...@redhat.com>
> >> ---
> >>
> >> I probably need to add unit test coverage of this before committing
> >> (although proving that I win the data race on a client process exiting
> >> faster than the parent can write enough data to hit EPIPE is hard).
> >>
> >> plugins/sh/nbdkit-sh-plugin.pod | 8 +++++++
> >> plugins/sh/call.c | 38 ++++++++++++++++++++++-----------
> >> 2 files changed, 34 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/plugins/sh/nbdkit-sh-plugin.pod
> >> b/plugins/sh/nbdkit-sh-plugin.pod
> >> index b2c946a0..8b83a5b3 100644
> >> --- a/plugins/sh/nbdkit-sh-plugin.pod
> >> +++ b/plugins/sh/nbdkit-sh-plugin.pod
> >> @@ -505,6 +505,14 @@ Unlike in other languages, if you provide a C<pwrite>
> >> method you
> >> B<must> also provide a C<can_write> method which exits with code C<0>
> >> (true).
> >>
> >> +With nbdkit E<ge> 1.36, this method may return C<0> without consuming
> >> +any data from stdin, and without producing any output, in order to
> >> +behave as an intentional data sink. But in older versions, nbdkit
> >> +would treat any C<EPIPE> failure in writing to your script as an error
> >> +condition even if your script returns success; to avoid unintended
> >> +failures, you may want to include C<"cat >/dev/null"> in a script
> >> +intending to ignore the client's write requests.
> >> +
> >> =item C<flush>
> >>
> >> /path/to/script flush <handle>
> >> diff --git a/plugins/sh/call.c b/plugins/sh/call.c
> >> index 888c6459..79c67a04 100644
> >> --- a/plugins/sh/call.c
> >> +++ b/plugins/sh/call.c
> >> @@ -34,6 +34,7 @@
> >>
> >> #include <assert.h>
> >> #include <fcntl.h>
> >> +#include <stdbool.h>
> >> #include <stdio.h>
> >> #include <stdlib.h>
> >> #include <inttypes.h>
> >> @@ -130,6 +131,7 @@ debug_call (const char **argv)
> >> */
> >> static int
> >> call3 (const char *wbuf, size_t wbuflen, /* sent to stdin (can be NULL) */
> >> + bool *pipe_full, /* set if wbuf not fully written
> >> */
> >> string *rbuf, /* read from stdout */
> >> string *ebuf, /* read from stderr */
> >> const char **argv) /* script + parameters */
> >> @@ -275,15 +277,8 @@ call3 (const char *wbuf, size_t wbuflen, /* sent to
> >> stdin (can be NULL) */
> >> r = write (pfds[0].fd, wbuf, wbuflen);
> >> if (r == -1) {
> >> if (errno == EPIPE) {
> >> - /* We tried to write to the script but it didn't consume
> >> - * the data. Probably the script exited without reading
> >> - * from stdin. This is an error in the script.
> >> - */
> >> - nbdkit_error ("%s: write to script failed because of a broken
> >> pipe: "
> >> - "this can happen if the script exits without "
> >> - "consuming stdin, which usually indicates a bug "
> >> - "in the script",
> >> - argv0);
> >> + *pipe_full = true;
> >> + r = wbuflen;
> >> }
> >> else
> >> nbdkit_error ("%s: write: %m", argv0);
> >> @@ -555,7 +550,7 @@ call (const char **argv)
> >> CLEANUP_FREE_STRING string rbuf = empty_vector;
> >> CLEANUP_FREE_STRING string ebuf = empty_vector;
> >>
> >> - r = call3 (NULL, 0, &rbuf, &ebuf, argv);
> >> + r = call3 (NULL, 0, NULL, &rbuf, &ebuf, argv);
> >> return handle_script_error (argv[0], &ebuf, r);
> >> }
> >>
> >> @@ -568,7 +563,7 @@ call_read (string *rbuf, const char **argv)
> >> int r;
> >> CLEANUP_FREE_STRING string ebuf = empty_vector;
> >>
> >> - r = call3 (NULL, 0, rbuf, &ebuf, argv);
> >> + r = call3 (NULL, 0, NULL, rbuf, &ebuf, argv);
> >> r = handle_script_error (argv[0], &ebuf, r);
> >> if (r == ERROR)
> >> string_reset (rbuf);
> >> @@ -584,7 +579,26 @@ call_write (const char *wbuf, size_t wbuflen, const
> >> char **argv)
> >> int r;
> >> CLEANUP_FREE_STRING string rbuf = empty_vector;
> >> CLEANUP_FREE_STRING string ebuf = empty_vector;
> >> + bool pipe_full = false;
> >>
> >> - r = call3 (wbuf, wbuflen, &rbuf, &ebuf, argv);
> >> + r = call3 (wbuf, wbuflen, &pipe_full, &rbuf, &ebuf, argv);
> >> + if (pipe_full && r == OK) {
> >> + /* We allow scripts to intentionally ignore data, but they must
> >> + * have no output when doing so.
> >> + */
> >> + if (rbuf.len > 0 || ebuf.len > 0) {
> >> + nbdkit_error ("%s: write to script failed because of a broken pipe:
> >> "
> >> + "this can happen if the script exits without "
> >> + "consuming stdin, which usually indicates a bug "
> >> + "in the script",
> >> + argv[0]);
> >> + r = ERROR;
> >> + }
> >> + else
> >> + nbdkit_debug ("%s: write to script failed because of a broken pipe;
> >> "
> >> + "assuming this was an intentional data sink, although
> >> it "
> >> + "may indicate a bug in the script",
> >> + argv[0]);
> >> + }
> >> return handle_script_error (argv[0], &ebuf, r);
> >> }
> >> --
> >> 2.41.0
> >>
> >> _______________________________________________
> >> Libguestfs mailing list
> >> Libguestfs@redhat.com
> >> https://listman.redhat.com/mailman/listinfo/libguestfs
> >
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
_______________________________________________
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs