Re: [dpdk-dev] [EXT] [PATCH 1/3] acl: fix arm argument types

Aaron Conole Wed, 10 Apr 2019 10:20:55 -0700

Jerin Jacob Kollanukkaran <jer...@marvell.com> writes:

> On Wed, 2019-04-10 at 11:52 -0400, Aaron Conole wrote:
>> Jerin Jacob Kollanukkaran <jer...@marvell.com> writes:
>> 
>> > On Mon, 2019-04-08 at 14:24 -0400, Aaron Conole wrote:
>> > > ---------------------------------------------------------------
>> > > ----
>> > > ---
>> > > Compiler complains of argument type mismatch, like:
>> > 
>> > Can you share more details on how to reproduce this issue?
>> 
>> It will be generated using the meson build after enabling the neon
>> extension support (which isn't currently happening on ARM using meson
>> as
>> the build environment).
>
>
> Can you share the patch to enable this for testing.


Sure - I'm using these:

(needed)
1/3 - http://mails.dpdk.org/archives/dev/2019-March/128304.html
2/3 - http://mails.dpdk.org/archives/dev/2019-March/128305.html

(following only needed for travis support)
3/3 - http://mails.dpdk.org/archives/dev/2019-March/128306.html

-Aaron

> Since the additional memcpy in fastpath, I need to check the overhead
> and check the possibility to avoid the memcpy to case.
>
>
>> 
>> > We already have
>> > CFLAGS_acl_run_neon.o += -flax-vector-conversions
>> > in the Makefile.
>> > 
>> > If you are taking out -flax-vector-conversions the correct way to
>> > fix will be use vreinterpret*.
>> > 
>> > For me the code looks clean, If unnecessary casting is avoided.
>> 
>> I agree.  I merely make explicit the casts that the compiler will be
>> implicitly introducing.
>> 
>> > >    ../lib/librte_acl/acl_run_neon.h: In function ‘transition4’:
>> > >    ../lib/librte_acl/acl_run_neon.h:115:2: note: use -flax-
>> > > vector-
>> > > conversions
>> > >       to permit conversions between vectors with differing
>> > > element
>> > > types
>> > >       or numbers of subparts
>> > >      node_type = vbicq_s32(tr_hi_lo.val[0], index_msk);
>> > >      ^
>> > >    ../lib/librte_acl/acl_run_neon.h:115:41: error: incompatible
>> > > type
>> > > for
>> > >       argument 2 of ‘vbicq_s32’
>> > > 
>> > > Signed-off-by: Aaron Conole <acon...@redhat.com>
>> > > ---
>> > >  lib/librte_acl/acl_run_neon.h | 46 ++++++++++++++++++++---------
>> > > ----
>> > > --
>> > >  1 file changed, 27 insertions(+), 19 deletions(-)
>> > > 
>> > > 
>> > >  
>> > >  /*
>> > > @@ -179,6 +183,9 @@ search_neon_8(const struct rte_acl_ctx *ctx,
>> > > const uint8_t **data,
>> > >          acl_match_check_x4(0, ctx, parms, &flows, &index_array[0]);
>> > >          acl_match_check_x4(4, ctx, parms, &flows, &index_array[4]);
>> > >  
>> > > +        memset(&input0, 0, sizeof(input0));
>> > > +        memset(&input1, 0, sizeof(input1));
>> > 
>> > Why this memset only required for arm64? If it real issue,
>> > Shouldn't
>> > it required for x86 and ppc ?
>> 
>> No.  Please see the following lines (which is due to the ARM neon
>> intrinsic for setting individual lanes):
>> 
>>      while (flows.started > 0) {
>>              /* Gather 4 bytes of input data for each stream. */
>>              input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0),
>> input0, 0);
>>              input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4),
>> input1, 0);
>> 
>> Note: the first time through this loop, input0 and input1 appear on
>> the
>> rhs of the assignment before appearing on the lhs.  This will
>> generate
>> an uninitialized value warning, even though the assignments are to
>> individual lanes of the vector.
>> 
>> I squelched the warning from the compiler in the most brute-force way
>> possible.  Perhaps it would be better to use a static initialization
>> for
>> the vector but this code was intended to be RFC and to generate
>> feedback.
>> 
>> I guess one alternate approach could be:
>> 
>>    static const int32x4_t ZERO_VEC;
>>    int32x4_t input0 = ZERO_VEC, input1 = ZERO_VEC;
>> 
>>    ...
>> 
>>    int32x4_t input = ZERO_VEC;
>> 
>> This would have the benefit of keeping the initializer as 'fast' as
>> possible (although I recall a memset under a certain size threshold
>> is
>> the same effect, but not certain).
>> 
>> Either way, I prefer it to squelching the warning, since the warning
>> has been found to catch legitimate errors many times.
>
> I will get back to this after reproducing the issue locally.

Awesome - thanks.

Re: [dpdk-dev] [EXT] [PATCH 1/3] acl: fix arm argument types

Reply via email to