On Mon, Nov 30, 2020 at 10:40:49PM +0100, Alexander Graf wrote: > Hi Peter, > > On 30.11.20 22:08, Peter Collingbourne wrote: > > On Mon, Nov 30, 2020 at 12:56 PM Frank Yang <l...@google.com> wrote: > > > > > > > > > On Mon, Nov 30, 2020 at 12:34 PM Alexander Graf <ag...@csgraf.de> wrote: > > > > Hi Frank, > > > > > > > > Thanks for the update :). Your previous email nudged me into the right > > > > direction. I previously had implemented WFI through the internal timer > > > > framework which performed way worse. > > > Cool, glad it's helping. Also, Peter found out that the main thing > > > keeping us from just using cntpct_el0 on the host directly and compare > > > with cval is that if we sleep, cval is going to be much < cntpct_el0 by > > > the sleep time. If we can get either the architecture or macos to read > > > out the sleep time then we might be able to not have to use a poll > > > interval either! > > > > Along the way, I stumbled over a few issues though. For starters, the > > > > signal mask for SIG_IPI was not set correctly, so while pselect() would > > > > exit, the signal would never get delivered to the thread! For a fix, > > > > check out > > > > > > > > > > > > https://patchew.org/QEMU/20201130030723.78326-1-ag...@csgraf.de/20201130030723.78326-4-ag...@csgraf.de/ > > > > > > > Thanks, we'll take a look :) > > > > > > > Please also have a look at my latest stab at WFI emulation. It doesn't > > > > handle WFE (that's only relevant in overcommitted scenarios). But it > > > > does handle WFI and even does something similar to hlt polling, albeit > > > > not with an adaptive threshold. > > Sorry I'm not subscribed to qemu-devel (I'll subscribe in a bit) so > > I'll reply to your patch here. You have: > > > > + /* Set cpu->hvf->sleeping so that we get a > > SIG_IPI signal. */ > > + cpu->hvf->sleeping = true; > > + smp_mb(); > > + > > + /* Bail out if we received an IRQ meanwhile */ > > + if (cpu->thread_kicked || (cpu->interrupt_request & > > + (CPU_INTERRUPT_HARD | CPU_INTERRUPT_FIQ))) { > > + cpu->hvf->sleeping = false; > > + break; > > + } > > + > > + /* nanosleep returns on signal, so we wake up on kick. > > */ > > + nanosleep(ts, NULL); > > > > and then send the signal conditional on whether sleeping is true, but > > I think this is racy. If the signal is sent after sleeping is set to > > true but before entering nanosleep then I think it will be ignored and > > we will miss the wakeup. That's why in my implementation I block IPI > > on the CPU thread at startup and then use pselect to atomically > > unblock and begin sleeping. The signal is sent unconditionally so > > there's no need to worry about races between actually sleeping and the > > "we think we're sleeping" state. It may lead to an extra wakeup but > > that's better than missing it entirely. > > > Thanks a bunch for the comment! So the trick I was using here is to modify > the timespec from the kick function before sending the IPI signal. That way, > we know that either we are inside the sleep (where the signal wakes it up) > or we are outside the sleep (where timespec={} will make it return > immediately). > > The only race I can think of is if nanosleep does calculations based on the > timespec and we happen to send the signal right there and then. > > The problem with blocking IPIs is basically what Frank was describing > earlier: How do you unset the IPI signal pending status? If the signal is > never delivered, how can pselect differentiate "signal from last time is > still pending" from "new signal because I got an IPI"? > >
Hi Alex, There was a patch for x86 HVF that implements CPU kick and it wasn't merged (mostly because of my lazyness). It has some changes like you introduced in the series and VMX-specific handling of preemption timer to gurantee interrupt delivery without kick loss: https://patchwork.kernel.org/project/qemu-devel/patch/20200729124832.79375-1-r.bolsha...@yadro.com/ I wonder if it'd possible to have common handling of kicks for both x86 and arm (given that arch-specific bits are wrapped)? Thanks, Roman