Hi Julien,

> -----Original Message-----
> From: Julien Grall <jul...@xen.org>
> Sent: 2020年11月26日 18:55
> To: Wei Chen <wei.c...@arm.com>; Stefano Stabellini <sstabell...@kernel.org>
> Cc: Penny Zheng <penny.zh...@arm.com>; xen-devel@lists.xenproject.org;
> Andre Przywara <andre.przyw...@arm.com>; Bertrand Marquis
> <bertrand.marq...@arm.com>; Kaly Xin <kaly....@arm.com>; nd
> <n...@arm.com>
> Subject: Re: [PATCH] xen/arm: Add Cortex-A73 erratum 858921 workaround
> 
> Hi Wei,
> 
> Your e-mail font seems to be different to the usual plain text one. Are
> you sending the e-mail using HTML by any chance?
> 

It's strange, I always use the plain text.

> On 26/11/2020 02:07, Wei Chen wrote:
> > Hi Stefano,
> >
> >> -----Original Message-----
> >> From: Stefano Stabellini <sstabell...@kernel.org>
> >> Sent: 2020��11��26�� 8:00
> >> To: Wei Chen <wei.c...@arm.com>
> >> Cc: Julien Grall <jul...@xen.org>; Penny Zheng <penny.zh...@arm.com>;
> xen-
> >> de...@lists.xenproject.org; sstabell...@kernel.org; Andre Przywara
> >> <andre.przyw...@arm.com>; Bertrand Marquis
> <bertrand.marq...@arm.com>;
> >> Kaly Xin <kaly....@arm.com>; nd <n...@arm.com>
> >> Subject: RE: [PATCH] xen/arm: Add Cortex-A73 erratum 858921 workaround
> >>
> >> Resuming this old thread.
> >>
> >> On Fri, 13 Nov 2020, Wei Chen wrote:
> >>>> Hi,
> >>>>
> >>>> On 09/11/2020 08:21, Penny Zheng wrote:
> >>>>> CNTVCT_EL0 or CNTPCT_EL0 counter read in Cortex-A73 (all versions)
> >>>>> might return a wrong value when the counter crosses a 32bit boundary.
> >>>>>
> >>>>> Until now, there is no case for Xen itself to access CNTVCT_EL0,
> >>>>> and it also should be the Guest OS's responsibility to deal with
> >>>>> this part.
> >>>>>
> >>>>> But for CNTPCT, there exists several cases in Xen involving reading
> >>>>> CNTPCT, so a possible workaround is that performing the read twice,
> >>>>> and to return one or the other depending on whether a transition has
> >>>>> taken place.
> >>>>>
> >>>>> Signed-off-by: Penny Zheng <penny.zh...@arm.com>
> >>>>
> >>>> Acked-by: Julien Grall <jgr...@amazon.com>
> >>>>
> >>>> On a related topic, do we need a fix similar to Linux commit
> >>>> 75a19a0202db "arm64: arch_timer: Ensure counter register reads occur
> >>>> with seqlock held"?
> >>>>
> >>>
> >>> I take a look at this Linux commit, it seems to prevent the seq-lock to be
> >>> speculated.  Using an enforce ordering instead of ISB after the read 
> >>> counter
> >>> operation seems to be for performance reasons.
> >>>
> >>> I have found that you had placed an ISB before read counter in get_cycles
> >>> to prevent counter value to be speculated. But you haven't placed the
> second
> >>> ISB after reading. Is it because we haven't used the get_cycles in seq 
> >>> lock
> >>> critical context (Maybe I didn't find the right place)? So should we need 
> >>> to
> fix it
> >>> now, or you prefer to fix it now for future usage?
> >>
> >> Looking at the call sites, it doesn't look like we need any ISB after
> >> get_cycles as it is not used in any critical context. There is also a
> >> data dependency with the value returned by it.
> 
> I am assuming you looked at all the users of NOW(). Is that right?
> 
> >>
> >> So I am thinking we don't need any fix. At most we need an in-code comment?
> >
> > I agree with you to add an in-code comment. It will remind us in future when
> we
> > use the get_cycles in critical context. Adding it now will probably only 
> > lead to
> > meaningless performance degradation.
> 
> I read this as there would be no perfomance impact if we add the
> ordering it now. Did you intend to say?

Sorry about my English. I intended to say "Adding it now may introduce some
performance cost. And this performance cost may be not worth. Because Xen
may never use it in a similar scenario "

> 
> > Because Xen may never use it in a similar
> > scenario.
> 
> Right, there are two potentials approach here:
>    - Wait until there are a user
>        * Pros: Doesn't impact performance
>        * Cons: We rely on users/reviewers to catch any misuse
>    - Harden the code
>        * Pros: Less risk to introduce a bug inadvertently
>        * Cons: May impact the performance
> 
> In general, I prefer that the code is hardened by default if the
> performance impact is limited. This may save us hours of
> debugging/reproducing bug.
> 

From a preventive point of view, you're right.

> In addition, AFAICT, the x86 version of get_cycles() is already able to
> provide that ordering. So there are chances that code may rely on it.
> 
> While I don't necessarily agree to add barriers everywhere by default
> (this may have big impact on the platform). I think it is better to have
> an accurate number of cycles.
> 

As x86 had done it, I think it’s ok to do it for Arm. This will keep a function 
behaves the same on different architectures.

Thanks,
Wei Chen

> Cheers,
> 
> --
> Julien Grall

Reply via email to