The issue is that the vector lane-setting code looks like:
lval = lane_set(scalar, rval, lane id)
In this case, 'rval' is read before it is ever set, but it could just
as well be 0 for the first lane-setting call. Thereafter, we pass the
previous value of 'input' as the rval, and each call sets a different
lane.
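To make the pattern concrete, here is a minimal portable sketch. It is
not the NEON code itself: vec4_s32, lane_set() and gather4() are
hypothetical stand-ins I made up so it builds without arm_neon.h. It
only models the "copy rval, overwrite one lane" behaviour and shows the
first call being seeded with a zeroed vector.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for int32x4_t so the sketch builds without NEON. */
typedef struct { int32_t lane[4]; } vec4_s32;

/* Models lval = lane_set(scalar, rval, lane id): copy rval, overwrite one lane. */
static vec4_s32 lane_set(int32_t scalar, vec4_s32 rval, unsigned lane_id)
{
	assert(lane_id < 4);
	rval.lane[lane_id] = scalar;
	return rval;
}

/* Gather four scalars, seeding the first call with a zeroed vector. */
static vec4_s32 gather4(const int32_t src[4])
{
	const vec4_s32 zero = { { 0, 0, 0, 0 } };
	vec4_s32 input;

	input = lane_set(src[0], zero, 0);  /* 'input' is not read here */
	input = lane_set(src[1], input, 1); /* from here on, 'input' is set */
	input = lane_set(src[2], input, 2);
	input = lane_set(src[3], input, 3);
	return input;
}
```

Since all four lanes are overwritten on every pass, the zero in the
first call never shows up in the result; it only gives the compiler a
defined value to read.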
It would be nice if there were an intrinsic that built the whole vector
in one step (something we could call like lval =
lane_set_from_array(scalar_array)).
Then 'input' would never appear as an rval before it was set.
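For what it's worth, NEON's vld1q_s32() loads four contiguous int32_t
values into a vector, which is close to the call described above if the
four scalars are first gathered into a small array. A rough sketch
follows; lanes_from_array() is a hypothetical helper name, the non-NEON
branch exists only so it builds elsewhere, and I have not measured
whether the extra store/load is acceptable in this loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: produce all four lanes from an array in one step. */
static void lanes_from_array(const int32_t scalars[4], int32_t out[4])
{
	assert(scalars != NULL && out != NULL);
#if defined(__ARM_NEON)
	/* Every lane is written by the load; no prior vector value is read. */
	int32x4_t v = vld1q_s32(scalars);

	vst1q_s32(out, v);
#else
	/* Portable fallback so the sketch builds without NEON. */
	for (int i = 0; i < 4; i++)
		out[i] = scalars[i];
#endif
}
```

In the real code the result would stay in an int32x4_t rather than
being stored back out; the store is here only so the result can be
inspected on any platform.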
I thought Jerin Jacob (CC'd) would have some opinion on the right fix.
There are three 'fixes' I know of: one is to squelch the warning
(which I don't like, because it could hide future code that
reintroduces the problem), one is to create a static and use
assignment, and one is to pass a zeroed vector as the rval in the
first call.
Actually, I think I have a patch that silences the warning without
introducing an assignment. Something like the following (not tested).
---
diff --git a/lib/librte_acl/acl_run_neon.h b/lib/librte_acl/acl_run_neon.h
index 01b9766d8..37c984fef 100644
--- a/lib/librte_acl/acl_run_neon.h
+++ b/lib/librte_acl/acl_run_neon.h
@@ -165,6 +165,7 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
uint64_t index_array[8];
struct completion cmplt[8];
struct parms parms[8];
+ static int32x4_t ZEROVAL;
int32x4_t input0, input1;
acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
@@ -181,8 +182,8 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
while (flows.started > 0) {
/* Gather 4 bytes of input data for each stream. */
- input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input0, 0);
- input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), input1, 0);
+ input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), ZEROVAL, 0);
+ input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), ZEROVAL, 0);
input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input0, 1);
input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 5), input1, 1);
@@ -227,6 +228,7 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
uint64_t index_array[4];
struct completion cmplt[4];
struct parms parms[4];
+ static int32x4_t ZEROVAL;
int32x4_t input;
acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
@@ -242,7 +244,7 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
while (flows.started > 0) {
/* Gather 4 bytes of input data for each stream. */
- input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input, 0);
+ input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), ZEROVAL, 0);
input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input, 1);
input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input, 2);
input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input, 3);
--
2.21.0