On Tue, Jan 12, 2021 at 8:28 AM Ruifeng Wang <ruifeng.w...@arm.com> wrote:
>
> Building with gcc 10.2 with SVE extension enabled got error:
>
> {standard input}: Assembler messages:
> {standard input}:91: Error: selected processor does not support `addvl 
> x4,x8,#-1'
> {standard input}:95: Error: selected processor does not support `ptrue 
> p1.d,all'
> {standard input}:135: Error: selected processor does not support `whilelo 
> p2.d,xzr,x5'
> {standard input}:137: Error: selected processor does not support `decb x1'
>
> This is because inline assembly code explicitly resets cpu model to
> not have SVE support. Thus SVE instructions generated by compiler
> auto vectorization got rejected by assembler.
>
> Added SVE to the cpu model specified by inline assembly for SVE support.
> Not replacing the inline assembly with C atomics because the driver relies
> on specific LSE instruction to interface to co-processor [1].
>
> Fixes: f0c7bb1bf778 ("net/octeontx/base: add octeontx IO operations")
> Cc: jer...@marvell.com
> Cc: sta...@dpdk.org
>
> [1] https://mails.dpdk.org/archives/dev/2021-January/196092.html
>
> Signed-off-by: Ruifeng Wang <ruifeng.w...@arm.com>


Reviewed-by: Jerin Jacob <jer...@marvell.com>


> ---
> v3:
> Keep inline assembly and add sve extension to fix issue. (Pavan)
>
>  drivers/net/octeontx/base/octeontx_io.h | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/octeontx/base/octeontx_io.h 
> b/drivers/net/octeontx/base/octeontx_io.h
> index 04b9ce191..d0b9cfbc6 100644
> --- a/drivers/net/octeontx/base/octeontx_io.h
> +++ b/drivers/net/octeontx/base/octeontx_io.h
> @@ -52,6 +52,11 @@ do {                                                 \
>  #endif
>
>  #if defined(RTE_ARCH_ARM64)
> +#if defined(__ARM_FEATURE_SVE)
> +#define __LSE_PREAMBLE " .cpu  generic+lse+sve\n"
> +#else
> +#define __LSE_PREAMBLE " .cpu  generic+lse\n"
> +#endif
>  /**
>   * Perform an atomic fetch-and-add operation.
>   */
> @@ -61,7 +66,7 @@ octeontx_reg_ldadd_u64(void *addr, int64_t off)
>         uint64_t old_val;
>
>         __asm__ volatile(
> -               " .cpu          generic+lse\n"
> +               __LSE_PREAMBLE
>                 " ldadd %1, %0, [%2]\n"
>                 : "=r" (old_val) : "r" (off), "r" (addr) : "memory");
>
> @@ -98,12 +103,13 @@ octeontx_reg_lmtst(void *lmtline_va, void *ioreg_va, 
> const uint64_t cmdbuf[],
>
>                 /* LDEOR initiates atomic transfer to I/O device */
>                 __asm__ volatile(
> -                       " .cpu          generic+lse\n"
> +                       __LSE_PREAMBLE
>                         " ldeor xzr, %0, [%1]\n"
>                         : "=r" (result) : "r" (ioreg_va) : "memory");
>         } while (!result);
>  }
>
> +#undef __LSE_PREAMBLE
>  #else
>
>  static inline uint64_t
> --
> 2.25.1
>

Reply via email to