Hi Julien,

> -----Original Message-----
> From: Julien Grall <jul...@xen.org>
> Sent: 2021年1月14日 2:30
> To: Wei Chen <wei.c...@arm.com>; xen-devel@lists.xenproject.org;
> sstabell...@kernel.org
> Cc: Bertrand Marquis <bertrand.marq...@arm.com>; Penny Zheng
> <penny.zh...@arm.com>; Jiamei Xie <jiamei....@arm.com>; nd
> <n...@arm.com>
> Subject: Re: [PATCH v2 2/2] xen/arm: Add defensive barrier in get_cycles for
> Arm64
> 
> 
> 
> On 09/01/2021 12:16, Wei Chen wrote:
> > HI Julien,
> 
> Hi Wei,
> 
> >> -----Original Message-----
> >> From: Julien Grall <jul...@xen.org>
> >> Sent: 2021年1月8日 19:56
> >> To: Wei Chen <wei.c...@arm.com>; xen-devel@lists.xenproject.org;
> >> sstabell...@kernel.org
> >> Cc: Bertrand Marquis <bertrand.marq...@arm.com>; Penny Zheng
> >> <penny.zh...@arm.com>; Jiamei Xie <jiamei....@arm.com>; nd
> >> <n...@arm.com>
> >> Subject: Re: [PATCH v2 2/2] xen/arm: Add defensive barrier in get_cycles 
> >> for
> >> Arm64
> >>
> >> Hi Wei,
> >>
> >> On 08/01/2021 06:21, Wei Chen wrote:
> >>> Per the discussion [1] on the mailing list, we'd better to
> >>> have a barrier after reading CNTPCT in get_cycles. If there
> >>> is not any barrier there. When get_cycles being used in some
> >>> seqlock critical context in the future, the seqlock can be
> >>> speculated potentially.
> >>>
> >>> We import Linux commit 75a19a0202db21638a1c2b424afb867e1f9a2376:
> >>>       arm64: arch_timer: Ensure counter register reads occur with seqlock
> held
> >>>
> >>>       When executing clock_gettime(), either in the vDSO or via a system 
> >>> call,
> >>>       we need to ensure that the read of the counter register occurs 
> >>> within
> >>>       the seqlock reader critical section. This ensures that updates to 
> >>> the
> >>>       clocksource parameters (e.g. the multiplier) are consistent with the
> >>>       counter value and therefore avoids the situation where time appears 
> >>> to
> >>>       go backwards across multiple reads.
> >>>
> >>>       Extend the vDSO logic so that the seqlock critical section covers 
> >>> the
> >>>       read of the counter register as well as accesses to the data page. 
> >>> Since
> >>>       reads of the counter system registers are not ordered by memory 
> >>> barrier
> >>>       instructions, introduce dependency ordering from the counter read 
> >>> to a
> >>>       subsequent memory access so that the seqlock memory barriers apply 
> >>> to
> >>>       the counter access in both the vDSO and the system call paths.
> >>>
> >>>       Cc: <sta...@vger.kernel.org>
> >>>       Cc: Marc Zyngier <marc.zyng...@arm.com>
> >>>       Tested-by: Vincenzo Frascino <vincenzo.frasc...@arm.com>
> >>>       Link: https://lore.kernel.org/linux-arm-
> >> kernel/alpine.deb.2.21.1902081950260.1...@nanos.tec.linutronix.de/
> >>>       Reported-by: Thomas Gleixner <t...@linutronix.de>
> >>>       Signed-off-by: Will Deacon <will.dea...@arm.com>
> >>>
> >>> While we are not aware of such use in Xen, it would be best to add the
> >>> barrier to avoid any suprise.
> >>>
> >>> In order to reduce the impact of new barrier, we perfer to
> >>> use enforce order instead of ISB [2].
> >>>
> >>> Currently, enforce order is not applied to arm32 as this is
> >>> not done in Linux at the date of this patch. If this is done
> >>> in Linux it will need to be also done in Xen.
> >>>
> >>> [1] https://lists.xenproject.org/archives/html/xen-devel/2020-
> >> 12/msg00181.html
> >>> [2] https://lkml.org/lkml/2020/3/13/645
> >>>
> >>> Signed-off-by: Wei Chen <wei.c...@arm.com>
> >>> ---
> >>> v1 -> v2:
> >>> 1. Update commit message to refer Linux commit.
> >>> 2. Change u64 to uint64_t
> >>> ---
> >>>    xen/include/asm-arm/time.h | 43
> >> ++++++++++++++++++++++++++++++++++++--
> >>>    1 file changed, 41 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/xen/include/asm-arm/time.h b/xen/include/asm-arm/time.h
> >>> index 5c4529ebb5..6b8fd839dd 100644
> >>> --- a/xen/include/asm-arm/time.h
> >>> +++ b/xen/include/asm-arm/time.h
> >>> @@ -11,9 +11,26 @@
> >>>
> >>>    typedef uint64_t cycles_t;
> >>>
> >>> -static inline cycles_t get_cycles(void)
> >>> +/*
> >>> + * Ensure that reads of the counter are treated the same as memory reads
> >>> + * for the purposes of ordering by subsequent memory barriers.
> >>> + */
> >>> +#if defined(CONFIG_ARM_64)
> >>> +#define read_cntpct_enforce_ordering(val) do { \
> >>> +    uint64_t tmp, _val = (val);                \
> >>> +                                               \
> >>> +    asm volatile(                              \
> >>> +    "eor %0, %1, %1\n"                         \
> >>> +    "add %0, sp, %0\n"                         \
> >>> +    "ldr xzr, [%0]"                            \
> >>> +    : "=r" (tmp) : "r" (_val));                \
> >>> +} while (0)
> >>> +#else
> >>> +#define read_cntpct_enforce_ordering(val) do {} while (0)
> >>> +#endif
> >>> +
> >>> +static inline cycles_t read_cntpct_stable(void)
> >>
> >> OOI, is there any particular reason to create a new helper?
> >>
> >
> > Yes, I try to reduce the #if defined(CONFIG_ARM_64) chunks.
> 
> Hmmm... There is no #ifdef chunk in read_cntpct_stable(). Did I miss
> anything?

No, It was I who misunderstood your comments.

> 
> > I think
> > if we introduce an empty helper for Arm32, we can reduce the other
> > chunk inside get_cycles. In addition, if we need to do the same work
> > for Arm32 in the future, we may not need to modify get_cycles.
> I don't really follow this. I wasn't asking about
> read_cntpct_enforce_ordering(). Instead I was asking about
> read_cntpct_stable() which looks like you just split get_cycles().
> 
> This change looks completely unrelated to the purpose of this series. I
> am not going to ask to split it, but I think this should be explained in
> the commit message.
> 

Yes, I forgot to explain this changes in the commit message.

When I was trying to add read_cntpct_enforce_ordering into get_cycles,
there were three ways I can do:
1. Add read_cntpct_enforce_ordering to the end of each branch
2. Add a new cycles_t variable and record value of each branch. Using
    the function end as unique return point. And then we can add
    read_cntpct_enforce_ordering to the end of get_cycles.
3. Don't touch the get_cycles function body, just rename it and create
    another helper named get_cycles to include original function and
    read_cntpct_enforce_ordering

Personally, I prefer the #3, so I changed the function like this.

How about adding the explanation in the end of commit message like this:
"
.... If this is done in Linux it will need to be also done in Xen.

To avoid adding read_cntpct_enforce_ordering everywhere, we introduced
a new helper read_cntpct_stable to replace original get_cycles, and turn
get_cycles to a wrapper which we can add read_cntpct_enforce_ordering
easily.
"

Thanks,
Wei Chen

> Cheers,
> 
> --
> Julien Grall

Reply via email to