Fei Li <lifei1...@126.com> writes:

> On 2019/1/9 at 1:29 AM, Markus Armbruster wrote:
>> fei <lifei1...@126.com> writes:
>>
>>>> On Jan 8, 2019, at 01:55, Markus Armbruster <arm...@redhat.com> wrote:
>>>>
>>>> Fei Li <f...@suse.com> writes:
>>>>
>>>>> To avoid the segmentation fault in qemu_thread_join(), just directly
>>>>> return when the QemuThread *thread failed to be created in either
>>>>> qemu-thread-posix.c or qemu-thread-win32.c.
>>>>>
>>>>> Cc: Stefan Weil <s...@weilnetz.de>
>>>>> Signed-off-by: Fei Li <f...@suse.com>
>>>>> Reviewed-by: Fam Zheng <f...@redhat.com>
>>>>> ---
>>>>>  util/qemu-thread-posix.c | 3 +++
>>>>>  util/qemu-thread-win32.c | 2 +-
>>>>>  2 files changed, 4 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>>>> index 39834b0551..3548935dac 100644
>>>>> --- a/util/qemu-thread-posix.c
>>>>> +++ b/util/qemu-thread-posix.c
>>>>> @@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
>>>>>      int err;
>>>>>      void *ret;
>>>>>
>>>>> +    if (!thread->thread) {
>>>>> +        return NULL;
>>>>> +    }
>>>> How can this happen?
>>> I think I have answered this earlier, please check the following link to
>>> see whether it helps:
>>> http://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg06554.html
>> Thanks for the pointer.  Unfortunately, I don't understand your
>> explanation.  You also wrote there "I will remove this patch in next
>> version"; looks like you've since changed your mind.
> Emm, issues left over from history.  The background is that I was in a
> hurry to get those five Reviewed-by patches merged, including this
> v9 16/16 patch, but not the real qemu_thread_create() modification.
> But this patch actually fixes the segmentation fault introduced once
> the qemu_thread_create() related functions are modified, even though
> it had received a Reviewed-by earlier. :) So, to avoid confusion, I
> wrote the "remove..." sentence to separate it from those five
> Reviewed-by patches, and planned to send only four patches.
> But later I got a message that these five patches were not that urgent
> to catch qemu v3.1, so I folded the earlier five R-b patches into the
> later v8 & v9 for a better review.
>
> Sorry for the trouble; I needed to explain this without involving too
> much background.
>
> Back at the farm: in our current qemu code, some cleanups use a loop
> to join() the total number of threads when the caller fails.  This is
> not a problem until the qemu_thread_create() modification is applied.
> E.g. when compress_threads_save_setup() fails while trying to create
> the last do_data_compress thread, a segmentation fault occurs when
> join() is called on it (sadly there is no condition available to
> filter out the unsuccessfully created thread), because that thread
> was never actually created.
>
> Hope the above makes it clear. :)
Alright, let's have a look at compress_threads_save_setup():

    static int compress_threads_save_setup(void)
    {
        int i, thread_count;

        if (!migrate_use_compression()) {
            return 0;
        }
        thread_count = migrate_compress_threads();
        compress_threads = g_new0(QemuThread, thread_count);
        comp_param = g_new0(CompressParam, thread_count);
        qemu_cond_init(&comp_done_cond);
        qemu_mutex_init(&comp_done_lock);
        for (i = 0; i < thread_count; i++) {
            comp_param[i].originbuf = g_try_malloc(TARGET_PAGE_SIZE);
            if (!comp_param[i].originbuf) {
                goto exit;
            }

            if (deflateInit(&comp_param[i].stream,
                            migrate_compress_level()) != Z_OK) {
                g_free(comp_param[i].originbuf);
                goto exit;
            }

            /* comp_param[i].file is just used as a dummy buffer to save data,
             * set its ops to empty.
             */
            comp_param[i].file = qemu_fopen_ops(NULL, &empty_ops);
            comp_param[i].done = true;
            comp_param[i].quit = false;
            qemu_mutex_init(&comp_param[i].mutex);
            qemu_cond_init(&comp_param[i].cond);
            qemu_thread_create(compress_threads + i, "compress",
                               do_data_compress, comp_param + i,
                               QEMU_THREAD_JOINABLE);
        }

        return 0;

    exit:
        compress_threads_save_cleanup();
        return -1;
    }

At label exit, we have @i threads, all fully initialized.  That's an
invariant.
compress_threads_save_cleanup() finds the threads to clean up by
checking comp_param[i].file:

    static void compress_threads_save_cleanup(void)
    {
        int i, thread_count;

        if (!migrate_use_compression() || !comp_param) {
            return;
        }

        thread_count = migrate_compress_threads();
        for (i = 0; i < thread_count; i++) {
            /*
             * we use it as a indicator which shows if the thread is
             * properly init'd or not
             */
--->        if (!comp_param[i].file) {
--->            break;
--->        }
            qemu_mutex_lock(&comp_param[i].mutex);
            comp_param[i].quit = true;
            qemu_cond_signal(&comp_param[i].cond);
            qemu_mutex_unlock(&comp_param[i].mutex);

            qemu_thread_join(compress_threads + i);
            qemu_mutex_destroy(&comp_param[i].mutex);
            qemu_cond_destroy(&comp_param[i].cond);
            deflateEnd(&comp_param[i].stream);
            g_free(comp_param[i].originbuf);
            qemu_fclose(comp_param[i].file);
            comp_param[i].file = NULL;
        }
        qemu_mutex_destroy(&comp_done_lock);
        qemu_cond_destroy(&comp_done_cond);
        g_free(compress_threads);
        g_free(comp_param);
        compress_threads = NULL;
        comp_param = NULL;
    }

Due to the invariant, a comp_param[i] with a null .file doesn't need
*any* cleanup.

To maintain the invariant, compress_threads_save_setup() carefully
cleans up any partial initializations itself before a goto exit.  Since
the code is arranged smartly, the only such cleanup is the
g_free(comp_param[i].originbuf) before the second goto exit.

Your PATCH 13 adds a third goto exit, but neglects to clean up partial
initializations.  Breaks the invariant.

I see two sane solutions:

1. compress_threads_save_setup() carefully cleans up partial
   initializations itself.  compress_threads_save_cleanup() copes only
   with fully initialized comp_param[i].  This is how things work
   before your series.

2. compress_threads_save_cleanup() copes with partially initialized
   comp_param[i], i.e. does the right thing for each goto exit in
   compress_threads_save_setup().  compress_threads_save_setup()
   doesn't clean up partial initializations.

Your PATCH 13 together with the fixup PATCH 16 does 3.
A confusing mix of the two.  Don't.

> Have a nice day
> Fei
>>
>> What exactly breaks if we omit this patch?  Assuming something does
>> break: imagine we did omit this patch, then forgot we ever saw it, and
>> now you've discovered the breakage.  Write us the bug report, complete
>> with reproducer.
>>
>> [...]