Mark Burton <mark.bur...@greensocs.com> writes:

> Paolo, Alex, Alexander,
>
> Talking to Fred after the call about ways of avoiding the ‘stop the world’
> (or rather ‘sync the world’) - we already discussed this on this thread.
> One thing that would be very helpful would be some test cases around
> this. We could then use Fred’s code to check some of the possible
> solutions out….

Yeah we certainly could do with some. I'm currently investigating the
memory barriers but TLB flushing might be easier to write at first.

> I’m not sure if there is wiggle room in Peter’s statement below. Can
> the TLB operation be completed on one core, but not ‘seen’ by other
> cores until they hit an exit…..?

I suspect they can - assuming no other guest synchronisation primitive
was in play who's to say the other cores weren't at their eventual PC
already. However I suspect the key thing is the first core doesn't
restart until all the other cores have caught up with their flush
operations.

> Cheers
>
> Mark.
>
>
>> On 26 Jun 2015, at 18:30, Frederic Konrad <fred.kon...@greensocs.com> wrote:
>>
>> On 26/06/2015 18:08, Peter Maydell wrote:
>>> On 26 June 2015 at 17:01, Paolo Bonzini <pbonz...@redhat.com> wrote:
>>>> On 26/06/2015 17:54, Frederic Konrad wrote:
>>>>> So what happen is:
>>>>> An arm instruction want to clear tlb of all VCPUs eg: IS version of
>>>>> TLBIALL.
>>>>> The VCPU which execute the TLBIALL_IS can't flush tlb of other VCPU.
>>>>> It will just ask all VCPU thread to exit and to do tlb_flush hence the
>>>>> async_work.
>>>>>
>>>>> Maybe the big issue might be memory barrier instruction here which I
>>>>> didn't checked.
>>>> Yeah, ISTR that in some cases you have to wait for other CPUs to
>>>> invalidate the TLB before proceeding. Maybe it's only when you have a
>>>> dmb instruction, but it's probably simpler for QEMU to always do it
>>>> synchronously.
>>> Yeah, the ARM architectural requirement here is that the TLB
>>> operation is complete after a DSB instruction executes. (True for
>>> any TLB op, not just the all-CPUs ones). NB that we also call
>>> tlb_flush() from target-arm/ code for some things like "we just
>>> updated a system register"; some of those have "must take effect
>>> immediately" semantics.
>>>
>>> In any case, for generic code we have to also consider the
>>> semantics of non-ARM guests...
>>>
>>> thanks
>>> -- PMM
>> Yes this is not the case as I implemented it.
>>
>> The rest of the TB will be executed before the tlb_flush work really happen.
>> The old version did this, was slow and was a mess (if two VCPUs want to
>> tlb_flush at the same time and an other tlb_flush_page.. it becomes tricky..)
>>
>> I think it's not really terrible if the other VCPU execute some stuff before
>> doing the tlb_flush.? So the solution would be only to cut the TranslationBlock
>> after instruction which require a tlb_flush?
>>
>> Thanks,
>> Fred
>>

-- 
Alex Bennée
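
P.S. A toy model of the synchronous variant discussed in this thread, which might serve as a starting point for a test case: the requesting vCPU queues a flush on every other vCPU and does not continue until each of them has drained its work queue at a translation-block boundary. This is a Python sketch, not QEMU code. `VCPU`, `queue_work`, `run_pending`, `flush_all_sync` and `demo` are hypothetical names, and the dict-based TLB and the short sleep standing in for TB execution are assumptions made for illustration.

```python
import threading
import time


class VCPU:
    """Toy vCPU: a private TLB plus a queue of work run at the next TB exit."""

    def __init__(self, index):
        self.index = index
        self.tlb = {0x1000: 0x8000}  # hypothetical cached translation
        self.pending = []
        self.lock = threading.Lock()

    def queue_work(self, fn):
        with self.lock:
            self.pending.append(fn)

    def run_pending(self):
        # Take the queued work under the lock, run it outside the lock.
        with self.lock:
            work, self.pending = self.pending, []
        for fn in work:
            fn(self)


def flush_all_sync(requester, vcpus, timeout=5.0):
    """Flush every TLB; return True once all other vCPUs have acknowledged."""
    others = [v for v in vcpus if v is not requester]
    remaining = len(others)
    count_lock = threading.Lock()
    done = threading.Event()

    def flush_work(cpu):
        nonlocal remaining
        cpu.tlb.clear()
        with count_lock:
            remaining -= 1
            if remaining == 0:
                done.set()

    requester.tlb.clear()
    if not others:
        return True
    for v in others:
        v.queue_work(flush_work)
    # The requester does not restart until every other vCPU has flushed.
    return done.wait(timeout)


def vcpu_loop(cpu, stop):
    while not stop.is_set():
        time.sleep(0.001)   # stand-in for executing one translation block
        cpu.run_pending()   # TB boundary: drain queued work


def demo(n=4):
    vcpus = [VCPU(i) for i in range(n)]
    stop = threading.Event()
    threads = [
        threading.Thread(target=vcpu_loop, args=(v, stop), daemon=True)
        for v in vcpus[1:]
    ]
    for t in threads:
        t.start()
    acknowledged = flush_all_sync(vcpus[0], vcpus)
    all_empty = all(not v.tlb for v in vcpus)
    stop.set()
    for t in threads:
        t.join()
    return acknowledged, all_empty
```

Here `demo()` plays the role of the vCPU executing the broadcast flush. Dropping the `done.wait()` call would give the asynchronous behaviour Fred describes, where the other vCPUs may still run with stale entries for the rest of their current TB.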