Hi Peter,

On 16.06.2015 12:59, Peter Maydell wrote:
> On 16 June 2015 at 11:33, Peter Maydell <peter.mayd...@linaro.org> wrote:
>> Pressing a key does not unwedge the test case for me.
>
> Looking at the logs, this seems to be expected given what
> the guest code does with CPU #1 (the below is edited logs,
> created with a hacky patch I have that annotates the debug
> logs with CPU numbers):
>
> CPU #1: Trace 0x7f2d67afa000 [80000100] _start
> # we start
> CPU #1: Trace 0x7f2d67afc060 [8000041c] main_cpu1
> # we correctly figured out we're CPU 1
> CPU #1: Trace 0x7f2d67afc220 [80000448] main_cpu1
> # we took the branch to 80000448
> CPU #1: Trace 0x7f2d67afc220 [80000448] main_cpu1
> # 80000448 is a branch-to-self, so here we stay
>
> CPU #1 never bothered to enable its GICC cpu interface,
> so it will never receive interrupts and will never get
> out of this tight loop.

Yes. CPU#1 is stuck in the initial spinlock, which lacks a WFE.

> We get here because CPU #1 has got through main_cpu1
> to the point of testing your 'release' variable before
> CPU #0 has got through main_cpu0 far enough to set it
> to 1, so it still has the zero in it that it has on
> system startup. If scheduling happened to mean that
> CPU #0 ran further through main_cpu0 before CPU #1
> ran, we wouldn't end up in this situation -- you have a
> race condition, as I suggested.
>
> The log shows we're sat with CPU#0 fruitlessly looping
> on a variable in memory, and CPU#1 in this endless loop.

I know that the startup is racy because I removed too much code from
the original project. But the startup is not my problem; it's the
later parts. I added a WFE to the initial lock.
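For reference, the fixed lock now looks roughly like this. Below is a
sketch in C with inline assembly, not my actual code; 'release' is the
flag from your log annotations, and I am assuming ARMv7 here, where the
barriers are instructions (on ARMv6/ARM11 they are CP15 operations):

    /* Sketch only, not the actual test code. CPU #0 releases, CPU #1
     * waits. Assumes ARMv7: DMB/DSB/SEV available as instructions. */
    static volatile unsigned int release;        /* zero on system startup */

    void cpu1_wait_for_release(void)             /* runs on CPU #1 */
    {
        while (release == 0)
            __asm__ volatile ("wfe" ::: "memory");  /* sleep until SEV/event */
        __asm__ volatile ("dmb" ::: "memory");      /* order later loads after the flag read */
    }

    void cpu0_set_release(void)                  /* runs on CPU #0 */
    {
        __asm__ volatile ("dmb" ::: "memory");   /* publish earlier stores first */
        release = 1;
        __asm__ volatile ("dsb" ::: "memory");   /* let the store complete... */
        __asm__ volatile ("sev");                /* ...then wake any CPU in WFE */
    }

An SEV arriving before CPU #1 reaches its WFE is not lost: it sets the
event register, so the next WFE falls straight through.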
Here are two new tests, both now 3178 bytes in size:

http://www.cs.hs-rm.de/~zuepke/qemu/ipi.elf
http://www.cs.hs-rm.de/~zuepke/qemu/ipi_yield.elf

Both start on my machine. The IPI ping-pong starts with the first
timer interrupt, after 1 s. The problem is that IPIs are delivered
only once a second, whenever the timer interrupt wakes QEMU's main
loop.

> PS: QEMU doesn't care, but your binary seems to be entirely
> devoid of barrier instructions, which is likely to cause
> you problems on real hardware.
>
> thanks
> -- PMM

Yes, I trimmed my code down to the bare minimum needed to handle IPIs
on QEMU. It lacks barriers and cache handling, and has bogus baudrate
settings.

Something else: existing ARM CPUs so far do not use hyper-threading,
but have real physical cores. In contrast, QEMU behaves like an
extremely coarse-grained hyper-threading architecture, so legacy code
that was written with physical cores in mind will trigger timing bugs
in its synchronization primitives -- especially code originally
written for the ARM11 MPCore, like mine, which lacks WFE/SEV. If we
consider QEMU a platform for running legacy code, doesn't it make
sense to address these issues?

Best regards
Alex
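P.S. Since the GICC setup was the other thing CPU #1 got wrong in the
first test: each core has to enable its own CPU interface before it can
receive SGIs. A rough sketch of the GIC pieces involved, with made-up
base addresses (the real ones are board-specific) and GICv1/v2
architectural register offsets:

    /* Sketch only; the base addresses are placeholders. */
    #define GICD_BASE 0x10041000u   /* distributor base (made up) */
    #define GICC_BASE 0x10042000u   /* per-CPU interface base (made up) */

    #define GICC_CTLR (*(volatile unsigned int *)(GICC_BASE + 0x000))
    #define GICC_PMR  (*(volatile unsigned int *)(GICC_BASE + 0x004))
    #define GICD_SGIR (*(volatile unsigned int *)(GICD_BASE + 0xf00))

    void gic_cpu_init(void)          /* run this on *each* core */
    {
        GICC_PMR  = 0xff;            /* priority mask: let everything through */
        GICC_CTLR = 0x1;             /* enable this core's CPU interface */
    }

    void send_ipi(unsigned int cpu, unsigned int sgi)   /* sgi = 0..15 */
    {
        /* target CPU list in bits [23:16], SGI number in bits [3:0] */
        GICD_SGIR = (1u << (16 + cpu)) | (sgi & 0xfu);
    }

(The distributor itself also needs enabling once, via GICD_CTLR at
offset 0.)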