ml: add Arm NEON type conversion routines

Ruifeng Wang Sun, 11 Dec 2022 23:16:19 -0800

> -----Original Message-----
> From: Srikanth Yalavarthi <syalavar...@marvell.com>
> Sent: Friday, December 9, 2022 3:36 AM
> To: Srikanth Yalavarthi <syalavar...@marvell.com>; Ruifeng Wang 
> <ruifeng.w...@arm.com>
> Cc: dev@dpdk.org; sshankarn...@marvell.com; jer...@marvell.com; 
> apra...@marvell.com
> Subject: [PATCH v1 4/4] common/ml: add Arm NEON type conversion routines
> 
> Added ARM NEON intrinsic based implementations to support conversion of
> data types. Support is enabled to handle int8, uint8, int16, uint16,
> float16, float32 and bfloat16 types.
> 
> Signed-off-by: Srikanth Yalavarthi <syalavar...@marvell.com>
> ---
>  drivers/common/ml/meson.build     |   5 +
>  drivers/common/ml/ml_utils.c      |  48 ++
>  drivers/common/ml/ml_utils_neon.c | 950 ++++++++++++++++++++++++++++++
>  drivers/common/ml/ml_utils_neon.h |  23 +
>  4 files changed, 1026 insertions(+)
>  create mode 100644 drivers/common/ml/ml_utils_neon.c
>  create mode 100644 drivers/common/ml/ml_utils_neon.h
> 
> diff --git a/drivers/common/ml/meson.build b/drivers/common/ml/meson.build
> index 84ae84ee4e..f7ce19b4b4 100644
> --- a/drivers/common/ml/meson.build
> +++ b/drivers/common/ml/meson.build
> @@ -17,6 +17,11 @@ sources = files(
>          'ml_utils_generic.c',
>  )
> 
> +if arch_subdir == 'arm'
> +    headers += files('ml_utils_neon.h')
> +    sources += files('ml_utils_neon.c')
> +endif
> +
>  deps += ['mldev']
> 
>  pmd_supports_disable_iova_as_pa = true
> diff --git a/drivers/common/ml/ml_utils.c b/drivers/common/ml/ml_utils.c
> index e2edef0904..3edcf09fde 100644
> --- a/drivers/common/ml/ml_utils.c
> +++ b/drivers/common/ml/ml_utils.c
> @@ -120,71 +120,119 @@ ml_io_format_to_str(enum rte_ml_io_format format, char *str, int len)
>  int
>  ml_float32_to_int8(float scale, uint64_t nb_elements, void *input, void *output)
>  {
> +#if defined(__ARM_NEON__)
> +     return ml_float32_to_int8_neon(scale, nb_elements, input, output);
> +#else
>       return ml_float32_to_int8_generic(scale, nb_elements, input, output);
> +#endif
>  }
> 
Maybe __rte_weak can be used to remove the ifdef clutter.


Something like:
ml_utils.c:
__rte_weak int ml_float32_to_int8(float scale, uint64_t nb_elements, void *input, void *output)
{
        return ml_float32_to_int8_generic(scale, nb_elements, input, output);
}
ml_utils_neon.c:
int ml_float32_to_int8(float scale, uint64_t nb_elements, void *input, void *output)
{
        return ml_float32_to_int8_neon(scale, nb_elements, input, output);
}
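
To make the weak-symbol idea concrete, here is a minimal sketch that builds without DPDK. Everything in it is an assumption for illustration: `demo_weak` stands in for `__rte_weak` from rte_common.h, the `demo_` function names are hypothetical, and the rounding and saturation in the fallback are a placeholder rather than the conversion the patch actually implements.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for DPDK's __rte_weak (rte_common.h) so the sketch builds without DPDK. */
#define demo_weak __attribute__((__weak__))

/* Hypothetical scalar fallback; NOT the patch's ml_float32_to_int8_generic(). */
static int
demo_float32_to_int8_generic(float scale, uint64_t nb_elements, void *input, void *output)
{
	const float *in = input;
	int8_t *out = output;
	uint64_t i;

	if (in == NULL || out == NULL)
		return -EINVAL;

	for (i = 0; i < nb_elements; i++) {
		float v = scale * in[i];

		/* Round half away from zero, then saturate to the int8 range. */
		v += (v >= 0.0f) ? 0.5f : -0.5f;
		if (v > 127.0f)
			v = 127.0f;
		if (v < -128.0f)
			v = -128.0f;
		out[i] = (int8_t)v;
	}

	return 0;
}

/*
 * Weak default definition. The linker uses it only when no other object
 * file provides a non-weak definition of demo_float32_to_int8().
 */
demo_weak int
demo_float32_to_int8(float scale, uint64_t nb_elements, void *input, void *output)
{
	return demo_float32_to_int8_generic(scale, nb_elements, input, output);
}
```

On an Arm build, a second source file (ml_utils_neon.c in the suggestion above) would define the same function name without the weak attribute, and the linker would pick that definition instead, so the caller needs no #if/#else.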

<snip>
> diff --git a/drivers/common/ml/ml_utils_neon.c b/drivers/common/ml/ml_utils_neon.c
> new file mode 100644
> index 0000000000..b660de07ec
> --- /dev/null
> +++ b/drivers/common/ml/ml_utils_neon.c
> @@ -0,0 +1,950 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2022 Marvell.
> + */
> +
> +#include <errno.h>
> +#include <math.h>
> +#include <stdint.h>
> +
> +#include <rte_common.h>
> +#include <rte_vect.h>
> +
> +#include "ml_utils.h"
> +#include "ml_utils_neon.h"
> +
> +#include <arm_neon.h>
This line can be removed; <arm_neon.h> is already included by rte_vect.h.

Thanks.
<snip>
