Re: [Qemu-devel] [PATCH RFC v5 1/7] Fix segmentation fault when qemu_signal_init fails

Markus Armbruster Fri, 12 Oct 2018 06:27:31 -0700

Fei Li <f...@suse.com> writes:

> On 10/12/2018 03:56 PM, Markus Armbruster wrote:
>> Fei Li <f...@suse.com> writes:
>>
>>> On 10/11/2018 06:02 PM, Markus Armbruster wrote:
>>>> Fei Li <f...@suse.com> writes:
>>>>
>>>>> Currently, when qemu_signal_init() fails it only returns a non-zero
>>>>> value but without propagating any Error. But its callers need a
>>>>> non-null err when runs error_report_err(err), or else 0->msg occurs.
>>>> The bug is in qemu_init_main_loop():
>>>>
>>>>       ret = qemu_signal_init();
>>>>       if (ret) {
>>>>           return ret;
>>>>       }
>>>>
>>>> Fails without setting an error, unlike the other failures.  Its callers
>>>> crash then.
>>> Thanks!
>> Consider working that into your commit message.
> Ok. I'd like to reword as follows:
> Currently in qemu_init_main_loop(), qemu_signal_init() fails without
> setting an error
> like the other failures. Its callers crash then when runs
> error_report_err(err).


Polishing the English a bit:

  When qemu_signal_init() fails in qemu_init_main_loop(), we return
  without setting an error.  Its callers crash then when they try to
  report the error with error_report_err().

>>
>>>>> To avoid such segmentation fault, add a new Error parameter to make
>>>>> the call trace to propagate the err to the final caller.
>>>>>
>>>>> Signed-off-by: Fei Li <f...@suse.com>
>>>>> Reviewed-by: Fam Zheng <f...@redhat.com>
>>>>> ---
>>>>>    include/qemu/osdep.h | 2 +-
>>>>>    util/compatfd.c      | 9 ++++++---
>>>>>    util/main-loop.c     | 9 ++++-----
>>>>>    3 files changed, 11 insertions(+), 9 deletions(-)
>>>>>
>>>>> diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
>>>>> index 4f8559e550..f1f56763a0 100644
>>>>> --- a/include/qemu/osdep.h
>>>>> +++ b/include/qemu/osdep.h
>>>>> @@ -421,7 +421,7 @@ struct qemu_signalfd_siginfo {
>>>>>                                 additional fields in the future) */
>>>>>    };
>>>>>    -int qemu_signalfd(const sigset_t *mask);
>>>>> +int qemu_signalfd(const sigset_t *mask, Error **errp);
>>>>>    void sigaction_invoke(struct sigaction *action,
>>>>>                          struct qemu_signalfd_siginfo *info);
>>>>>    #endif
>>>>> diff --git a/util/compatfd.c b/util/compatfd.c
>>>>> index 980bd33e52..d3ed890405 100644
>>>>> --- a/util/compatfd.c
>>>>> +++ b/util/compatfd.c
>>>>> @@ -16,6 +16,7 @@
>>>>>    #include "qemu/osdep.h"
>>>>>    #include "qemu-common.h"
>>>>>    #include "qemu/thread.h"
>>>>> +#include "qapi/error.h"
>>>>>      #include <sys/syscall.h>
>>>>>    @@ -65,7 +66,7 @@ static void *sigwait_compat(void *opaque)
>>>>>        }
>>>>>    }
>>>>>    -static int qemu_signalfd_compat(const sigset_t *mask)
>>>>> +static int qemu_signalfd_compat(const sigset_t *mask, Error **errp)
>>>>>    {
>>>>>        struct sigfd_compat_info *info;
>>>>>        QemuThread thread;
>>>>> @@ -73,11 +74,13 @@ static int qemu_signalfd_compat(const sigset_t *mask)
>>>>>          info = malloc(sizeof(*info));
>>>>>        if (info == NULL) {
>>>>> +        error_setg(errp, "Failed to allocate signalfd memory");
>>>>>            errno = ENOMEM;
>>>>>            return -1;
>>>>>        }
>>>>>          if (pipe(fds) == -1) {
>>>>> +        error_setg(errp, "Failed to create signalfd pipe");
>>>>>            free(info);
>>>>>            return -1;
>>>>>        }
>>>>> @@ -94,7 +97,7 @@ static int qemu_signalfd_compat(const sigset_t *mask)
>>>>>        return fds[0];
>>>>>    }
>>>>>    -int qemu_signalfd(const sigset_t *mask)
>>>>> +int qemu_signalfd(const sigset_t *mask, Error **errp)
>>>>>    {
>>>>>    #if defined(CONFIG_SIGNALFD)
>>>>>        int ret;
>>>>> @@ -106,5 +109,5 @@ int qemu_signalfd(const sigset_t *mask)
>>>>>        }
>>>>>    #endif
>>>>>    -    return qemu_signalfd_compat(mask);
>>>>> +    return qemu_signalfd_compat(mask, errp);
>>>>>    }
>>>> I think this takes the Error conversion too far.
>>>>
>>>> qemu_signalfd() is like the signalfd() system call, only portable, and
>>>> setting FD_CLOEXEC.  In particular, it reports failure just like a
>>>> system call: it sets errno and returns -1.  I'd prefer to keep it that
>>>> way.  Instead...
>>>>
>>>>> diff --git a/util/main-loop.c b/util/main-loop.c
>>>>> index affe0403c5..9671b6d226 100644
>>>>> --- a/util/main-loop.c
>>>>> +++ b/util/main-loop.c
>>>>> @@ -71,7 +71,7 @@ static void sigfd_handler(void *opaque)
>>>>>        }
>>>>>    }
>>>>>    -static int qemu_signal_init(void)
>>>>> +static int qemu_signal_init(Error **errp)
>>>>>    {
>>>>>        int sigfd;
>>>>>        sigset_t set;
>>>>> @@ -94,9 +94,8 @@ static int qemu_signal_init(void)
>>>>>        pthread_sigmask(SIG_BLOCK, &set, NULL);
>>>>>          sigdelset(&set, SIG_IPI);
>>>>> -    sigfd = qemu_signalfd(&set);
>>>>> +    sigfd = qemu_signalfd(&set, errp);
>>>>>        if (sigfd == -1) {
>>>>> -        fprintf(stderr, "failed to create signalfd\n");
>>>>>            return -errno;
>>>>>        }
>>>>>    
>>>> ... change this function so:
>>>>
>>>>          pthread_sigmask(SIG_BLOCK, &set, NULL);
>>>>              sigdelset(&set, SIG_IPI);
>>>>          sigfd = qemu_signalfd(&set);
>>>>          if (sigfd == -1) {
>>>>     -        fprintf(stderr, "failed to create signalfd\n");
>>>>     +        error_setg_errno(errp, errno, "failed to create signalfd");
>>>>              return -errno;
>>>>          }
>>>>
>>>> Does this make sense?
>>> Yes, it looks more concise if we only have this patch, but triggers
>>> one errno-set
>>> issue after applying patch 7/7, which adds a Error **errp parameter for
>>> qemu_thread_create() to let its callers handle the error instead of
>>> exit() directly
>>> to add the robustness.
>> I guess you're talking about the qemu_thread_create() in
>> qemu_signalfd_compat().  Correct?
> Yes.
>>
>>> For the patch series' current implementation, the  modified
>>> qemu_thread_create()
>>> in 7/7 patch returns a Boolean value to indicate whether it succeeds
>>> and set the
>>> error reason into the passed errp, and did not set the errno. Actually
>>> another
>>> similar errno-set issue has been talked in last patch. :)
>>> If we set the errno in future qemu_thread_create(), we need to
>>> distinguish the Linux
>>> and Windows implementation. For Linux, we can use error_setg_errno()
>>> to set errno.
>>> But for Windows, I am not sure if we can use
>>>
>>> "errno = GetLastError()"
>> No, that won't work.
>>
>>> to set errno, as this seems a little weird. Do you have any idea about this?
>>>
>>> BTW, if we have a decent errno-set way for Windows, I will adopt your above
>>> proposal for this patch.
>> According to MS docs, _beginthreadex() sets errno on failure:
>>
>>      If successful, each of these functions returns a handle to the newly
>>      created thread; however, if the newly created thread exits too
>>      quickly, _beginthread might not return a valid handle. (See the
>>      discussion in the Remarks section.) On an error, _beginthread
>>      returns -1L, and errno is set to EAGAIN if there are too many
>>      threads, to EINVAL if the argument is invalid or the stack size is
>>      incorrect, or to EACCES if there are insufficient resources (such as
>>      memory). On an error, _beginthreadex returns 0, and errno and
>>      _doserrno are set.
>>
>> https://docs.microsoft.com/cpp/c-runtime-library/reference/beginthread-beginthreadex
>>
>> Looks like the current code's use of GetLastError() after
>> _beginthreadex() failure is wrong.
>>
>> Fix that, and both qemu_thread_create() implementations set errno on
>> failure, which in turn lets you make qemu_signalfd_compat() and thus
>> qemu_signalfd() behave sanely regardless of which qemu_thread_create()
>> implementation they use below the hood.
> Thanks for the detail explanation. :)
> To fix that, how about replacing the "GetLastError()" with the returned
> value "hThread" (actually returns 0)? I mean
>    ...
>     hThread = (HANDLE) _beginthreadex(NULL, 0, win32_start_routine,
>                                       data, 0, &thread->tid);
>     if (!hThread) {
>         if (data->mode != QEMU_THREAD_DETACHED) {
>             DeleteCriticalSection(&data->cs);
>         }
>         error_setg_win32(errp, hThread,
>                          "failed to create win32_start_routine");
>         g_free(data);
>         return false;
>     }

No.  On failure, _beginthreadex() returns *zero*, not an error code.  It
also sets errno.  That's the error code you need to use:

    hThread = (HANDLE) _beginthreadex(NULL, 0, win32_start_routine,
                                      data, 0, &thread->tid);
    if (!hThread) {
        error_setg_errno(errp, errno, "can't create thread");
    }

Except I really wouldn't convert qemu_thread_create() to Error!  I'd
make it return zero on success, a negative errno code on failure, and
leave the Error business to its callers.  Basically, replace
error_exit(err, ...) by return err.

The caller qemu_signalfd_compat() can then do

    ret = qemu_thread_create(&thread, "signalfd_compat",
                             sigwait_compat, info, QEMU_THREAD_DETACHED);
    if (ret < 0) {
        errno = ret;
        return -1;
    }

A caller that already has an Error **errp parameter could do

    ret = qemu_thread_create(...);
    if (ret < 0) {
        error_setg_errno(errp, -ret, "<error message goes here>");
    }

Callers that want to continue aborting on failure simply do

    ret = qemu_thread_create(...);
    assert(ret >= 0);

If that turns out to be too much of a bother, you could create a
convenience wrapper for it:

    void qemu_thread_create_nofail(QemuThread *thread, const char *name,
                                   void *(*start_routine)(void*),
                                   void *arg, int mode)
    {
        int err = qemu_thread_create(thread, name, start_routine, arg, mode);
        if (err) {
            error_exit(err, __func__);
        }
    }

Re: [Qemu-devel] [PATCH RFC v5 1/7] Fix segmentation fault when qemu_signal_init fails

Reply via email to