On Wed, Mar 26, 2025 at 11:39:28AM +0100, David Marchand wrote:
> Caught in OBS for Fedora Rawhide on aarch64:
>
> [ 198s] In file included from ../lib/acl/acl_run_neon.h:7,
> [ 198s]                  from ../lib/acl/acl_run_neon.c:5:
> [ 198s] In function ‘alloc_completion’,
> [ 198s]     inlined from ‘acl_start_next_trie’ at ../lib/acl/acl_run.h:140:24,
> [ 198s]     inlined from ‘search_neon_4.isra’ at ../lib/acl/acl_run_neon.h:239:20:
> [ 198s] ../lib/acl/acl_run.h:93:25: error: ‘cmplt’ may be used uninitialized [-Werror=maybe-uninitialized]
> [ 198s]    93 |         if (p[n].count == 0) {
> [ 198s]       |             ~~~~^~~~~~
> [ 198s] ../lib/acl/acl_run_neon.h: In function ‘search_neon_4.isra’:
> [ 198s] ../lib/acl/acl_run_neon.h:230:27: note: ‘cmplt’ declared here
> [ 198s]   230 |         struct completion cmplt[4];
> [ 198s]       |                           ^~~~~
>
> The code was resetting sequentially cmpl[].count at the exact index that
> later call to alloc_completion uses.
> While this code seems correct, GCC 15 does not understand this (probably
> when applying some optimisations).
>
> Instead, reset cmpl[].count all at once in acl_set_flow, and cleanup the
> various vectorized implementations accordingly.
>
> Bugzilla ID: 1678
> Cc: sta...@dpdk.org
>
> Signed-off-by: David Marchand <david.march...@redhat.com>
> ---
>  lib/acl/acl_run.h         | 5 +++++
>  lib/acl/acl_run_altivec.h | 8 ++------
>  lib/acl/acl_run_avx2.h    | 4 +---
>  lib/acl/acl_run_neon.h    | 8 ++------
>  lib/acl/acl_run_scalar.c  | 4 +---
>  lib/acl/acl_run_sse.h     | 8 ++------
>  6 files changed, 13 insertions(+), 24 deletions(-)
>
> diff --git a/lib/acl/acl_run.h b/lib/acl/acl_run.h
> index 7f092413cd..9fd3e60021 100644
> --- a/lib/acl/acl_run.h
> +++ b/lib/acl/acl_run.h
> @@ -176,6 +176,8 @@ acl_set_flow(struct acl_flow_data *flows, struct completion *cmplt,
>  	uint32_t cmplt_size, const uint8_t **data, uint32_t *results,
>  	uint32_t data_num, uint32_t categories, const uint64_t *trans)
>  {
> +	unsigned int i;
> +
>  	flows->num_packets = 0;
>  	flows->started = 0;
>  	flows->trie = 0;
> @@ -187,6 +189,9 @@ acl_set_flow(struct acl_flow_data *flows, struct completion *cmplt,
>  	flows->data = data;
>  	flows->results = results;
>  	flows->trans = trans;
> +
> +	for (i = 0; i < cmplt_size; i++)
> +		cmplt[i].count = 0;
>  }

Minor nit, but since we are using the C11 standard, would it not be better
to declare "i" inside the "for" statement? That keeps diffs simpler when
adding or removing code, I think.

For the rest of the code, I need to take a bit more time reviewing to be
sure I understand the change. I'll try to get to it later.

/Bruce

>
>  typedef void (*resolve_priority_t)
> diff --git a/lib/acl/acl_run_altivec.h b/lib/acl/acl_run_altivec.h
> index 2d398ffded..d5ccdb94f0 100644
> --- a/lib/acl/acl_run_altivec.h
> +++ b/lib/acl/acl_run_altivec.h
> @@ -199,10 +199,8 @@ search_altivec_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
>  	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
>  		total_packets, categories, ctx->trans_table);
>
> -	for (n = 0; n < MAX_SEARCHES_ALTIVEC8; n++) {
> -		cmplt[n].count = 0;
> +	for (n = 0; n < MAX_SEARCHES_ALTIVEC8; n++)
>  		index_array[n] = acl_start_next_trie(&flows, parms, n, ctx);
> -	}
>
> <snip>
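
To illustrate the nit about declaring "i" in the "for" statement, here is a
rough sketch of the reset loop with a loop-scoped counter. It is not taken
from the patch: struct completion below is a cut-down stand-in with only
the count field, and reset_completions is a hypothetical helper name (in
the patch the loop sits directly at the end of acl_set_flow).

```c
#include <stdint.h>

/* Cut-down stand-in for the ACL library's struct completion; only the
 * field the patch touches is modelled here (assumption for illustration). */
struct completion {
	uint32_t count;
};

/* Hypothetical helper: zero every count up front, before any slot is
 * handed out, which is what the patch does inside acl_set_flow. The index
 * is declared in the for statement (allowed since C99), so the change
 * needs no separate declaration at the top of the function. */
static void
reset_completions(struct completion *cmplt, uint32_t cmplt_size)
{
	for (uint32_t i = 0; i < cmplt_size; i++)
		cmplt[i].count = 0;
}
```

With this shape the whole change is a single added hunk next to the other
initialisation, and removing it later would touch only those same lines.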