On 28.11.2014 at 15:45, Markus Armbruster wrote:
> Paolo Bonzini <pbonz...@redhat.com> writes:
>
> > ELF thread local storage is about 10% faster on tests/test-coroutine's
> > perf/cost test.  The timing on my machine is 160ns per iteration with
> > pthread TLS, 145 with ELF TLS.
> >
> > Based on a patch by Kevin Wolf and Peter Lieven, but redone to follow
> > the model of coroutine-win32.c (including the important "noinline"
> > attribute!!!).
> >
> > Platforms without thread-local storage (OpenBSD probably?) will need
> > a new-enough GCC for this to compile, in order to use the same emutls
> > support that Windows already relies on.
> [...]
> > @@ -193,15 +155,22 @@ void qemu_coroutine_delete(Coroutine *co_)
> >      g_free(co);
> >  }
> >
> > +/* This function is marked noinline to prevent GCC from inlining it
> > + * into coroutine_trampoline(). If we allow it to do that then it
> > + * hoists the code to get the address of the TLS variable "current"
> > + * out of the while() loop. This is an invalid transformation because
> > + * the SwitchToFiber() call may be called when running thread A but
> > + * return in thread B, and so we might be in a different thread
> > + * context each time round the loop.
> > + */
> >  CoroutineAction qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
> >                                        CoroutineAction action)
>
> Err, did you forget the actual __attribute__((noinline))?

The comment needs updating, too. There's no SwitchToFiber() in the
ucontext implementation.

Kevin
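
For illustration only, here is a minimal sketch of where the attribute would go, assuming GCC or Clang. The names demo_current, demo_set_current(), demo_get_current() and demo_check() are made up for this example and are not taken from the patch or from QEMU. The idea is that each access to the thread-local variable goes through a function the compiler may not inline, so the TLS address lookup happens inside that function on every call and is not hoisted into the caller's loop.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-thread "current" coroutine pointer. */
static __thread int demo_current;

/* noinline: the TLS address is computed inside this function on each
 * call, so a caller that resumes on a different thread still touches
 * the variable belonging to the thread it is running on. */
__attribute__((noinline)) void demo_set_current(int id)
{
    demo_current = id;
}

/* Same reasoning for the read side. */
__attribute__((noinline)) int demo_get_current(void)
{
    return demo_current;
}

/* Single-threaded sanity check: a value written through the setter is
 * read back through the getter. */
void demo_check(void)
{
    demo_set_current(7);
    assert(demo_get_current() == 7);
}
```

This only shows the attribute placement. It does not reproduce the thread migration that makes the hoisting invalid in the coroutine code.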