> -----Original Message-----
> From: Srikanth Yalavarthi <syalavar...@marvell.com>
> Sent: 15 March 2023 16:12
> To: Ruifeng Wang <ruifeng.w...@arm.com>
> Cc: dev@dpdk.org; Shivah Shankar Shankar Narayan Rao
> <sshankarn...@marvell.com>; david.march...@redhat.com; nd
> <n...@arm.com>; Srikanth Yalavarthi <syalavar...@marvell.com>; Srikanth
> Yalavarthi <syalavar...@marvell.com>
> Subject: RE: [PATCH v2 1/1] mldev: split bfloat16 routines to separate files
>
> > -----Original Message-----
> > From: Ruifeng Wang <ruifeng.w...@arm.com>
> > Sent: 15 March 2023 15:32
> > To: Srikanth Yalavarthi <syalavar...@marvell.com>
> > Cc: dev@dpdk.org; Shivah Shankar Shankar Narayan Rao
> > <sshankarn...@marvell.com>; david.march...@redhat.com; nd <n...@arm.com>
> > Subject: [EXT] RE: [PATCH v2 1/1] mldev: split bfloat16 routines to
> > separate files
> >
> > External Email
> >
> > ----------------------------------------------------------------------
> > > -----Original Message-----
> > > From: Srikanth Yalavarthi <syalavar...@marvell.com>
> > > Sent: Monday, March 13, 2023 8:03 PM
> > > To: Srikanth Yalavarthi <syalavar...@marvell.com>; Ruifeng Wang
> > > <ruifeng.w...@arm.com>
> > > Cc: dev@dpdk.org; sshankarn...@marvell.com; david.march...@redhat.com
> > > Subject: [PATCH v2 1/1] mldev: split bfloat16 routines to separate
> > > files
> > >
> > > Since bfloat16 intrinsics are not supported on all ARM platforms
> > > that support NEON,
> > > bfloat16 routines are moved to separate files.
> > > This would enable using scalar implementation for bfloat16 on
> > > unsupported ARM platforms.
> > >
> > > Bugzilla ID: 1179
> > > Fixes: fc54766b1612 ("mldev: add Arm NEON type conversion")
> > >
> > > Signed-off-by: Srikanth Yalavarthi <syalavar...@marvell.com>
> > > ---
> > > Depends-on: patch-120653 ("mldev: remove weak symbols use in type conversions")
> > > Depends-on: patch-125035 ("mldev: fix identical code in conditional branches")
> > >
> > >  lib/mldev/meson.build                   |  11 +-
> > >  lib/mldev/mldev_utils_neon.c            | 142 +-------------
> > >  lib/mldev/mldev_utils_neon_bfloat16.c   | 154 ++++++++++++++
> > >  lib/mldev/mldev_utils_scalar.c          | 262 +-----------------------
> > >  lib/mldev/mldev_utils_scalar.h          |  80 ++++++++
> > >  lib/mldev/mldev_utils_scalar_bfloat16.c | 197 ++++++++++++++++++
> > >  6 files changed, 445 insertions(+), 401 deletions(-)
> > >  create mode 100644 lib/mldev/mldev_utils_neon_bfloat16.c
> > >  create mode 100644 lib/mldev/mldev_utils_scalar.h
> > >  create mode 100644 lib/mldev/mldev_utils_scalar_bfloat16.c
> > >
> > > diff --git a/lib/mldev/meson.build b/lib/mldev/meson.build
> > > index c9db42257b..5769b0640a 100644
> > > --- a/lib/mldev/meson.build
> > > +++ b/lib/mldev/meson.build
> > > @@ -7,12 +7,21 @@ sources = files(
> > >          'mldev_utils.c',
> > >  )
> > >
> > > -if dpdk_conf.has('RTE_ARCH_ARM64')
> > > +if (dpdk_conf.has('RTE_ARCH_ARM64') and
> > > +    cc.get_define('__ARM_NEON', args: machine_args) != '')
> >
> > I found in ACLE document that "__ARM_NEON" is always set to 1 for
> > AArch64".
> > So this line of check is redundant?
>
> Checking for __ARM_NEON should be enough.
> We can drop the dpdk_conf.has('RTE_ARCH_ARM64') check.
> I will test the builds and submit a revised patch.
>
Correction: ideally, checking for RTE_ARCH_ARM64 would be enough. However, the
__ARM_NEON check is required when building with GCC 4.8.x.
I have tested this on CentOS 7 with GCC 4.8.5.
Refer to https://bugs.dpdk.org/show_bug.cgi?id=1179

The errors below are reported with GCC 4.8 when the __ARM_NEON check is not used:

../lib/mldev/mldev_utils_neon.c:220:2: warning: nested extern declaration of 'vcvtas_u32_f32' [-Wnested-externs]
../lib/mldev/mldev_utils_neon.c: In function '__uint8_to_float32_neon_f32x1':
../lib/mldev/mldev_utils_neon.c:297:2: warning: implicit declaration of function 'vcvts_f32_u32' [-Wimplicit-function-declaration]
  *output = scale * vcvts_f32_u32((uint32_t)*input);
  ^
../lib/mldev/mldev_utils_neon.c:297:2: warning: nested extern declaration of 'vcvts_f32_u32' [-Wnested-externs]
../lib/mldev/mldev_utils_neon.c: At top level:
../lib/mldev/mldev_utils_neon.c:604:51: error: unknown type name 'float16_t'
 __float32_to_float16_neon_f16x4(float32_t *input, float16_t *output)
                                                   ^

So, we will need both checks.

> >
> > >      sources += files('mldev_utils_neon.c')
> > >  else
> > >      sources += files('mldev_utils_scalar.c')
> > >  endif
> > >
> > > +if (dpdk_conf.has('RTE_ARCH_ARM64') and
> > > +    cc.get_define('__ARM_NEON', args: machine_args) != '' and
> >
> > Same here.
> >
> > > +    cc.get_define('__ARM_FEATURE_BF16', args: machine_args) != '')
> > > +    sources += files('mldev_utils_neon_bfloat16.c')
> > > +else
> > > +    sources += files('mldev_utils_scalar_bfloat16.c')
> > > +endif
> > > +
> > >  headers = files(
> > >          'rte_mldev.h',
> > >  )
> > <snip>
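
For anyone who wants to see what a given toolchain and set of machine flags would select, the two meson conditions in the quoted diff can be mirrored at the preprocessor level. This is only a rough sketch and not part of the patch: the helper names are hypothetical, and `__aarch64__` is used as an assumed stand-in for RTE_ARCH_ARM64, which is a DPDK build-config symbol and is not visible to a standalone compile.

```c
/*
 * Hypothetical helpers (not DPDK code): report which source file each
 * meson condition from the patch would pick for the current compiler
 * and flags. __aarch64__ is an assumed stand-in for RTE_ARCH_ARM64.
 */

/* Mirrors: RTE_ARCH_ARM64 and __ARM_NEON defined */
static const char *mldev_generic_impl(void)
{
#if defined(__aarch64__) && defined(__ARM_NEON)
	return "mldev_utils_neon.c";
#else
	return "mldev_utils_scalar.c";
#endif
}

/* Mirrors: RTE_ARCH_ARM64 and __ARM_NEON and __ARM_FEATURE_BF16 defined */
static const char *mldev_bfloat16_impl(void)
{
#if defined(__aarch64__) && defined(__ARM_NEON) && defined(__ARM_FEATURE_BF16)
	return "mldev_utils_neon_bfloat16.c";
#else
	return "mldev_utils_scalar_bfloat16.c";
#endif
}
```

Calling both helpers from a small main() and printing the results, once with the project's machine flags and once without, shows whether a toolchain would fall back to the scalar files. On a compiler that does not define __ARM_NEON for the target, both helpers return the scalar file names, which is the fallback the extra meson check is meant to select.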