Emilio G. Cota <c...@braap.org> writes:

> On Wed, Aug 03, 2016 at 22:02:04 +0100, Alex Bennée wrote:
>> Emilio G. Cota <c...@braap.org> writes:
>>
>> > On Tue, Aug 02, 2016 at 18:27:42 +0100, Alex Bennée wrote:
> (snip)
>> >> +void wait_safe_cpu_work(void)
>> >> +{
>> >> +    while (can_wait_for_safe() && atomic_mb_read(&safe_work_pending) > 0) {
>> >
>> > The atomic here is puzzling, see below.
>> >
>> >> +        /*
>> >> +         * If there is pending safe work and no pending threads we
>> >> +         * need to signal another thread to start its work.
>> >> +         */
>> >> +        if (tcg_pending_threads == 0) {
>> >> +            qemu_cond_signal(&qemu_exclusive_cond);
>> >> +        }
>> >> +        qemu_cond_wait(&qemu_safe_work_cond, qemu_get_cpu_work_mutex());
>> >> +    }
>> >> +}
>> >>
>> >>  static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
>> >>  {
>> >> @@ -91,9 +121,18 @@ static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
>> >>      cpu->queued_work_last = wi;
>> >>      wi->next = NULL;
>> >>      wi->done = false;
>> >> +    if (wi->safe) {
>> >> +        atomic_inc(&safe_work_pending);
>> >> +    }
>> >
>> > This doesn't seem right. Operating on the condvar's shared 'state' variable
>> > should always be done with the condvar's mutex held. Otherwise, there's
>> > no guarantee that sleepers will always see a consistent state when they're
>> > woken up, which can easily lead to deadlock.
>>
>> How so? Surely the barriers around the atomic accesses and the implicit
>> barriers of the mutexes ensure it is?
>
> Barriers guarantee that accesses will be perceived in the right order.
> However, they do not determine *when* those accesses will be seen by
> other CPUs. For that we need stronger primitives, i.e. atomics like
> the ones embedded in locks. Otherwise we might end up with races that
> are very hard to debug.
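If I'm reading the objection right, the change you have in mind for
queue_work_on_cpu() is roughly the following (sketch only, reusing the
names from the patch above; exactly where the lock is taken and dropped
is my guess):

/* sketch: update the condvar's shared counter with the work mutex held,
 * so a waiter woken in wait_safe_cpu_work() always sees a value that is
 * consistent with the condition it is sleeping on */
qemu_mutex_lock(qemu_get_cpu_work_mutex());
if (wi->safe) {
    safe_work_pending++;   /* no atomic needed once under the lock */
}
qemu_mutex_unlock(qemu_get_cpu_work_mutex());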
But that's what we have, atomic incs followed by a kick (cpu_exit/signal
conds) to wake up threads.

FWIW I did fix a problem with the MTTCG logic last night to do with sleeping:

  https://github.com/stsquad/qemu/commit/2f4f3ed149cfe87794a3bd4adfccf04b4cd81873

>
>> > I suspect this is what caused the deadlock you saw in the last iteration
>> > of the series.
>> >
>> > An additional requirement is the fact that new CPUs can come anytime in
>> > user-mode (imagine we're flushing the TB while a new pthread was just
>> > spawned). This is easily triggered by greatly reducing the size of the
>> > translation buffer, and spawning dozens of threads.
>>
>> I don't suppose you have a test case written up for this already?
>>
>> My kvm-unit-tests are fairly extensive for SoftMMU mode but for
>> user-mode I was only using pigz with the TB buffer scaled down.
>> Obviously I need to expand the user-mode testing.
>
> A tiny TB buffer (redefining the constants in translate-all.c), plus
> a program running under linux-user that spawns many threads that do actual
> work is a good test.

Yeah, this is what the pigz test does. I made the buffer very small and
had several clashing flush events, but it was all stable. A more
self-contained stress program along the lines of the sketch further down
this mail would also do the job.

>
>> > A possible fix is to sched safe work after exiting the CPU loop, i.e.
>> > with qemu_get_cpu_work_mutex held. I tried this on v4 of this patchset
>> > and it doesn't scale very well on 64 cores (too much contention
>> > on tb_lock), although at least it doesn't deadlock.
>>
>> Where exactly? Surely tb_lock contention shouldn't be a problem as it is
>> only held for generation and patching now?
>
> Booting up 64 cores on x86_64 can show contention on a 64-core host,
> since CPU kicks are frequent. Do you have this v5 + mttcg + cmpxchg in a
> branch so that I can test?

You can have async-work-v5 + base-patches-v4-wip:

  https://github.com/stsquad/qemu/commits/mttcg/base-patches-v4-04082016-for-emilio

Last time I experimented with merging your cmpxchg work it went in
without any conflicts, so hopefully that is still the case now. This
tree currently runs all of the kvm-unit-tests for ARMv7 and v8 in
-accel tcg,thread=single mode and runs the tcg and tlbflush test groups
in -accel tcg,thread=multi mode. It looks like I have some silly
compile errors to fix for Travis to be happy though.

>> > An alternative is to have a separate lock for safe work, and check for
>> > safe work once there are no other locks held; a good place to do this is
>> > at the beginning of cpu_loop_exec. This scales better, and I'd argue
>> > it's simpler. In fact, I posted a patch that does this about a year
>> > ago (!):
>> > https://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg02576.html
>>
>> I'll have another look at this. One thing I prefer about this series is
>> it keeps all the work mechanisms together. I think that's worth striving
>> for if we can.
>
> Sure. I don't think that old patchset has much value apart from raising
> some issues that aren't mentioned in this series.
>
> By the way, before even considering the merge of this patchset, I think
> we should look at merging the cmpxchg work first (at least for x86)
> so that we can thoroughly test this set at least with linux-user. Otherwise
> we'll see errors/segfaults and we won't know what caused them.

Yes, I reckon the cmpxchg and barrier work should get merged first as it
is both testable and worthwhile for the existing linux-user case.
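As for the stress program mentioned above, here is a rough sketch. It is
illustrative only (the thread count, rounds and workload are arbitrary,
and it is not the exact test I have been running); the point is lots of
short-lived pthreads doing real work so that a deliberately shrunken
code-gen buffer forces frequent flushes while new threads keep
appearing. Build with something like "gcc -O2 -pthread -static" and run
it under the target's linux-user binary:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NTHREADS 64   /* dozens of threads, as suggested */
#define ROUNDS   32   /* respawn them repeatedly so new CPUs keep arriving */

static void *worker(void *arg)
{
    unsigned long x = (unsigned long)arg;
    long i;

    for (i = 0; i < 10 * 1000 * 1000; i++) {
        x = x * 1103515245UL + 12345;   /* just keep the guest CPU busy */
    }
    return (void *)x;
}

int main(void)
{
    pthread_t threads[NTHREADS];
    int r, i;

    for (r = 0; r < ROUNDS; r++) {
        for (i = 0; i < NTHREADS; i++) {
            if (pthread_create(&threads[i], NULL, worker,
                               (void *)(unsigned long)i)) {
                fprintf(stderr, "pthread_create failed\n");
                exit(1);
            }
        }
        for (i = 0; i < NTHREADS; i++) {
            pthread_join(threads[i], NULL);
        }
    }
    printf("all rounds done\n");
    return 0;
}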
If we are happy that async-work is fit for purpose in SoftMMU mode as
well, I would expect it to get merged ahead of the main base enabling
set. In fact, since the base patches have pulled in a bunch of the
cputlb patches from the ARM enabling series, I think we are pretty close
to:

  base-patches-v4 + any functioning atomic fix = support for all platforms ;-)

There are a few wrinkles that need to be checked out for each platform
though, so I still expect we'll be enabling combos as we go. This mainly
has to do with things like booting up CPUs in an MTTCG-safe way.

--
Alex Bennée