Re: [Qemu-devel] [PULL 22/24] target-arm: ensure all cross vCPUs TLB flushes complete

Dmitry Osipenko Mon, 18 Sep 2017 05:24:22 -0700

On 18.09.2017 13:10, Alex Bennée wrote:
> 
> Dmitry Osipenko <dig...@gmail.com> writes:
> 
>> On 17.09.2017 16:22, Alex Bennée wrote:
>>>
>>> Dmitry Osipenko <dig...@gmail.com> writes:
>>>
>>>> On 24.02.2017 14:21, Alex Bennée wrote:
>>>>> Previously flushes on other vCPUs would only get serviced when they
>>>>> exited their TranslationBlocks. While this isn't overly problematic it
>>>>> violates the semantics of TLB flush from the point of view of source
>>>>> vCPU.
>>>>>
>>>>> To solve this we call the cputlb *_all_cpus_synced() functions to do
>>>>> the flushes which ensures all flushes are completed by the time the
>>>>> vCPU next schedules its own work. As the TLB instructions are modelled
>>>>> as CP writes the TB ends at this point meaning cpu->exit_request will
>>>>> be checked before the next instruction is executed.
>>>>>
>>>>> Deferring the work until the architectural sync point is a possible
>>>>> future optimisation.
>>>>>
>>>>> Signed-off-by: Alex Bennée <alex.ben...@linaro.org>
>>>>> Reviewed-by: Richard Henderson <r...@twiddle.net>
>>>>> Reviewed-by: Peter Maydell <peter.mayd...@linaro.org>
>>>>> ---
>>>>>  target/arm/helper.c | 165 ++++++++++++++++++++++------------------------------
>>>>>  1 file changed, 69 insertions(+), 96 deletions(-)
>>>>>
>>>>
>>>> Hello,
>>>>
>>>> I have an issue with Linux kernel stopping to boot on a SMP 32bit ARM
>>>> (haven't checked 64bit) in a single-threaded TCG mode. Kernel reaches
>>>> point where it should mount rootfs over NFS and vCPUs stop. This issue
>>>> is reproducible with any 32bit ARM machine type. Kernel boots fine with
>>>> a MTTCG accel, only single-threaded TCG is affected. Git bisection lead
>>>> to this patch, any ideas?
>>>
>>> It shouldn't cause a problem but can you obtain a backtrace of the
>>> system when hung?
>>>
>>
>> Actually, it looks like TCG enters infinite loop. Do you mean backtrace
>> of QEMU by 'backtrace of the system'? If so, here it is:
>>
>> Thread 4 (Thread 0x7ffa37f10700 (LWP 20716)):
>> #0  0x00007ffa601888bd in poll () at ../sysdeps/unix/syscall-template.S:84
>> #1  0x00007ffa5e3aa561 in poll (__timeout=-1, __nfds=2, __fds=0x7ffa30006dc0) at /usr/include/bits/poll2.h:46
>> #2  poll_func (ufds=0x7ffa30006dc0, nfds=2, timeout=-1, userdata=0x557bd603eae0) at /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/thread-mainloop.c:69
>> #3  0x00007ffa5e39bbb1 in pa_mainloop_poll (m=m@entry=0x557bd60401f0) at /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:844
>> #4  0x00007ffa5e39c24e in pa_mainloop_iterate (m=0x557bd60401f0, block=<optimized out>, retval=0x0) at /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:926
>> #5  0x00007ffa5e39c300 in pa_mainloop_run (m=0x557bd60401f0, retval=retval@entry=0x0) at /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:944
>> #6  0x00007ffa5e3aa4a9 in thread (userdata=0x557bd60400f0) at /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/thread-mainloop.c:100
>> #7  0x00007ffa599eea38 in internal_thread_func (userdata=0x557bd603e090) at /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulsecore/thread-posix.c:81
>> #8  0x00007ffa60453657 in start_thread (arg=0x7ffa37f10700) at pthread_create.c:456
>> #9  0x00007ffa60193c5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
>>
>> Thread 3 (Thread 0x7ffa4adff700 (LWP 20715)):
>> #0  0x00007ffa53e51caf in code_gen_buffer ()
>>
> 
> Well it's not locked up in servicing any flush tasks as it's executing
> code. Maybe the guest code is spinning on something?
>


Indeed, I should have used 'exec' instead of 'in_asm'.

> In the monitor:
> 
>   info registers
> 
> Will show you where things are, see if the ip is moving each time. Also
> you can do a disassemble dump from there to see what code it is stuck
> on.
> 

I attached GDB to QEMU to see where it got stuck. It turned out to be caused
by CONFIG_STRICT_KERNEL_RWX=y in the Linux kernel. Upon boot completion the
kernel changes memory permissions, and that change is executed on a dedicated
CPU while the other CPUs are 'stopped' in a busy loop.
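
To illustrate why busy-looping CPUs hurt only the single-threaded case, here
is a toy model in Python. It is not QEMU code: the function name
host_slices() and the notion of a fixed "slice" per scheduled vCPU are made
up for illustration, and it assumes that single-threaded TCG runs the vCPUs
round-robin on one host thread while MTTCG gives each vCPU its own thread.

```python
def host_slices(n_vcpus, work_units, parallel):
    """Count host time slices until the working vCPU finishes.

    Toy model (hypothetical, not QEMU's scheduler): vCPU 0 needs
    `work_units` slices of real work, every other vCPU only spins
    in a busy loop until that work is done.
    """
    if parallel:
        # MTTCG-like case: each vCPU has its own host thread, so the
        # spinning vCPUs cost no extra wall-clock slices.
        return work_units

    # Single-threaded case: one host thread runs the vCPUs in turn.
    slices = 0
    remaining = work_units
    cpu = 0
    while remaining > 0:
        slices += 1                 # every scheduled vCPU burns one slice
        if cpu == 0:
            remaining -= 1          # only the working vCPU makes progress
        cpu = (cpu + 1) % n_vcpus   # round-robin to the next vCPU
    return slices


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, host_slices(n, 100, parallel=False),
              host_slices(n, 100, parallel=True))
```

In this model the round-robin cost grows roughly linearly with the number of
spinning vCPUs, while the parallel cost stays flat. The real slowdown will
depend on how often the working CPU is forced to exit its TranslationBlock.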

So this patch only introduced a noticeable performance regression for
single-threaded TCG, which is probably fine since MTTCG is the default now.
Thank you very much for the suggestions and all your work on MTTCG!

-- 
Dmitry
