Hi Julien, > -----Original Message----- > From: Julien Grall <jul...@xen.org> > Sent: 2020年11月26日 18:55 > To: Wei Chen <wei.c...@arm.com>; Stefano Stabellini <sstabell...@kernel.org> > Cc: Penny Zheng <penny.zh...@arm.com>; xen-devel@lists.xenproject.org; > Andre Przywara <andre.przyw...@arm.com>; Bertrand Marquis > <bertrand.marq...@arm.com>; Kaly Xin <kaly....@arm.com>; nd > <n...@arm.com> > Subject: Re: [PATCH] xen/arm: Add Cortex-A73 erratum 858921 workaround > > Hi Wei, > > Your e-mail font seems to be different to the usual plain text one. Are > you sending the e-mail using HTML by any chance? >
It's strange, I always use the plain text. > On 26/11/2020 02:07, Wei Chen wrote: > > Hi Stefano, > > > >> -----Original Message----- > >> From: Stefano Stabellini <sstabell...@kernel.org> > >> Sent: 2020��11��26�� 8:00 > >> To: Wei Chen <wei.c...@arm.com> > >> Cc: Julien Grall <jul...@xen.org>; Penny Zheng <penny.zh...@arm.com>; > xen- > >> de...@lists.xenproject.org; sstabell...@kernel.org; Andre Przywara > >> <andre.przyw...@arm.com>; Bertrand Marquis > <bertrand.marq...@arm.com>; > >> Kaly Xin <kaly....@arm.com>; nd <n...@arm.com> > >> Subject: RE: [PATCH] xen/arm: Add Cortex-A73 erratum 858921 workaround > >> > >> Resuming this old thread. > >> > >> On Fri, 13 Nov 2020, Wei Chen wrote: > >>>> Hi, > >>>> > >>>> On 09/11/2020 08:21, Penny Zheng wrote: > >>>>> CNTVCT_EL0 or CNTPCT_EL0 counter read in Cortex-A73 (all versions) > >>>>> might return a wrong value when the counter crosses a 32bit boundary. > >>>>> > >>>>> Until now, there is no case for Xen itself to access CNTVCT_EL0, > >>>>> and it also should be the Guest OS's responsibility to deal with > >>>>> this part. > >>>>> > >>>>> But for CNTPCT, there exists several cases in Xen involving reading > >>>>> CNTPCT, so a possible workaround is that performing the read twice, > >>>>> and to return one or the other depending on whether a transition has > >>>>> taken place. > >>>>> > >>>>> Signed-off-by: Penny Zheng <penny.zh...@arm.com> > >>>> > >>>> Acked-by: Julien Grall <jgr...@amazon.com> > >>>> > >>>> On a related topic, do we need a fix similar to Linux commit > >>>> 75a19a0202db "arm64: arch_timer: Ensure counter register reads occur > >>>> with seqlock held"? > >>>> > >>> > >>> I take a look at this Linux commit, it seems to prevent the seq-lock to be > >>> speculated. Using an enforce ordering instead of ISB after the read > >>> counter > >>> operation seems to be for performance reasons. > >>> > >>> I have found that you had placed an ISB before read counter in get_cycles > >>> to prevent counter value to be speculated. But you haven't placed the > second > >>> ISB after reading. Is it because we haven't used the get_cycles in seq > >>> lock > >>> critical context (Maybe I didn't find the right place)? So should we need > >>> to > fix it > >>> now, or you prefer to fix it now for future usage? > >> > >> Looking at the call sites, it doesn't look like we need any ISB after > >> get_cycles as it is not used in any critical context. There is also a > >> data dependency with the value returned by it. > > I am assuming you looked at all the users of NOW(). Is that right? > > >> > >> So I am thinking we don't need any fix. At most we need an in-code comment? > > > > I agree with you to add an in-code comment. It will remind us in future when > we > > use the get_cycles in critical context. Adding it now will probably only > > lead to > > meaningless performance degradation. > > I read this as there would be no perfomance impact if we add the > ordering it now. Did you intend to say? Sorry about my English. I intended to say "Adding it now may introduce some performance cost. And this performance cost may be not worth. Because Xen may never use it in a similar scenario " > > > Because Xen may never use it in a similar > > scenario. > > Right, there are two potentials approach here: > - Wait until there are a user > * Pros: Doesn't impact performance > * Cons: We rely on users/reviewers to catch any misuse > - Harden the code > * Pros: Less risk to introduce a bug inadvertently > * Cons: May impact the performance > > In general, I prefer that the code is hardened by default if the > performance impact is limited. This may save us hours of > debugging/reproducing bug. > From a preventive point of view, you're right. > In addition, AFAICT, the x86 version of get_cycles() is already able to > provide that ordering. So there are chances that code may rely on it. > > While I don't necessarily agree to add barriers everywhere by default > (this may have big impact on the platform). I think it is better to have > an accurate number of cycles. > As x86 had done it, I think it’s ok to do it for Arm. This will keep a function behaves the same on different architectures. Thanks, Wei Chen > Cheers, > > -- > Julien Grall