> -----Original Message-----
> From: Srikanth Yalavarthi <syalavar...@marvell.com>
> Sent: Friday, December 9, 2022 3:36 AM
> To: Srikanth Yalavarthi <syalavar...@marvell.com>; Ruifeng Wang <ruifeng.w...@arm.com>
> Cc: dev@dpdk.org; sshankarn...@marvell.com; jer...@marvell.com; apra...@marvell.com
> Subject: [PATCH v1 4/4] common/ml: add Arm NEON type conversion routines
>
> Added ARM NEON intrinsic based implementations to support conversion of data types.
> Support is enabled to handle int8, uint8, int16, uint16, float16, float32 and bfloat16
> types.
>
> Signed-off-by: Srikanth Yalavarthi <syalavar...@marvell.com>
> ---
>  drivers/common/ml/meson.build     |   5 +
>  drivers/common/ml/ml_utils.c      |  48 ++
>  drivers/common/ml/ml_utils_neon.c | 950 ++++++++++++++++++++++++++++++
>  drivers/common/ml/ml_utils_neon.h |  23 +
>  4 files changed, 1026 insertions(+)
>  create mode 100644 drivers/common/ml/ml_utils_neon.c
>  create mode 100644 drivers/common/ml/ml_utils_neon.h
>
> diff --git a/drivers/common/ml/meson.build b/drivers/common/ml/meson.build
> index 84ae84ee4e..f7ce19b4b4 100644
> --- a/drivers/common/ml/meson.build
> +++ b/drivers/common/ml/meson.build
> @@ -17,6 +17,11 @@ sources = files(
>          'ml_utils_generic.c',
>  )
>
> +if arch_subdir == 'arm'
> +    headers += files('ml_utils_neon.h')
> +    sources += files('ml_utils_neon.c')
> +endif
> +
>  deps += ['mldev']
>
>  pmd_supports_disable_iova_as_pa = true
> diff --git a/drivers/common/ml/ml_utils.c b/drivers/common/ml/ml_utils.c
> index e2edef0904..3edcf09fde 100644
> --- a/drivers/common/ml/ml_utils.c
> +++ b/drivers/common/ml/ml_utils.c
> @@ -120,71 +120,119 @@ ml_io_format_to_str(enum rte_ml_io_format format, char *str, int len)
>  int ml_float32_to_int8(float scale, uint64_t nb_elements, void *input, void *output)
>  {
> +#if defined(__ARM_NEON__)
> +	return ml_float32_to_int8_neon(scale, nb_elements, input, output);
> +#else
>  	return ml_float32_to_int8_generic(scale, nb_elements, input, output);
> +#endif
>  }
>
Maybe __rte_weak can be used to remove the ifdef clutter.
Something like:

ml_utils.c:

__rte_weak int ml_float32_to_int8(float scale, uint64_t nb_elements, void *input, void *output)
{
	return ml_float32_to_int8_generic(scale, nb_elements, input, output);
}

ml_utils_neon.c:

int ml_float32_to_int8(float scale, uint64_t nb_elements, void *input, void *output)
{
	return ml_float32_to_int8_neon(scale, nb_elements, input, output);
}

<snip>

> diff --git a/drivers/common/ml/ml_utils_neon.c b/drivers/common/ml/ml_utils_neon.c
> new file mode 100644
> index 0000000000..b660de07ec
> --- /dev/null
> +++ b/drivers/common/ml/ml_utils_neon.c
> @@ -0,0 +1,950 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2022 Marvell.
> + */
> +
> +#include <errno.h>
> +#include <math.h>
> +#include <stdint.h>
> +
> +#include <rte_common.h>
> +#include <rte_vect.h>
> +
> +#include "ml_utils.h"
> +#include "ml_utils_neon.h"
> +
> +#include <arm_neon.h>

This line can be removed. arm_neon.h is already included by rte_vect.h.

Thanks.

<snip>
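
To illustrate the __rte_weak suggestion, here is a minimal single-file sketch of the weak-default side. It is only an illustration under stated assumptions: ML_WEAK stands in for __rte_weak (assumed here to expand to the GCC/Clang weak attribute), float32_to_int8_generic_sketch is a hypothetical placeholder for ml_float32_to_int8_generic, and its rounding and saturation behaviour is assumed rather than taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for DPDK's __rte_weak, taken to be the GCC/Clang weak attribute. */
#define ML_WEAK __attribute__((weak))

static_assert(sizeof(float) == 4, "this sketch assumes a 32-bit float");

/*
 * Hypothetical scalar fallback. The real ml_float32_to_int8_generic() is not
 * shown in this thread; rounding and saturation here are assumptions.
 */
static int
float32_to_int8_generic_sketch(float scale, uint64_t nb_elements, void *input, void *output)
{
	const float *in = input;
	int8_t *out = output;
	uint64_t i;

	if (nb_elements == 0 || input == NULL || output == NULL)
		return -EINVAL;

	for (i = 0; i < nb_elements; i++) {
		float v = scale * in[i];

		/* Round half away from zero, then saturate to the int8 range. */
		v += (v >= 0.0f) ? 0.5f : -0.5f;
		if (v >= 127.0f)
			out[i] = INT8_MAX;
		else if (v <= -128.0f)
			out[i] = INT8_MIN;
		else
			out[i] = (int8_t)v;
	}

	return 0;
}

/*
 * Weak default: the linker uses this definition only when no other object
 * file provides a non-weak ml_float32_to_int8().
 */
ML_WEAK int
ml_float32_to_int8(float scale, uint64_t nb_elements, void *input, void *output)
{
	return float32_to_int8_generic_sketch(scale, nb_elements, input, output);
}

/*
 * A file built only on Arm (ml_utils_neon.c in the patch) would then define a
 * non-weak function with the same name and signature, and the linker would
 * pick that one instead:
 *
 *   int
 *   ml_float32_to_int8(float scale, uint64_t nb_elements, void *input, void *output)
 *   {
 *           return ml_float32_to_int8_neon(scale, nb_elements, input, output);
 *   }
 */
```

On a non-Arm build only the weak definition is linked, so callers reach the generic path without any #if in ml_utils.c. The choice is made at link time based on which objects meson adds to the build, not on a compiler-defined macro.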