On Wed, 14 Feb 2024, Richard Sandiford wrote:

> Richard Biener <rguent...@suse.de> writes:
> > The following avoids accessing out-of-bound vector elements when
> > native encoding a boolean vector with sub-BITS_PER_UNIT precision
> > elements.  The error was basing the number of elements to extract
> > on the rounded up total byte size involved and the patch bases
> > everything on the total number of elements to extract instead.
>
> It's too long ago to be certain, but I think this was a deliberate choice.
> The point of the new vector constant encoding is that it can give an
> allegedly sensible value for any given index, even out-of-range ones.
>
> Since the padding bits are undefined, we should in principle have a free
> choice of what to use.  And for VLA, it's often better to continue the
> existing pattern rather than force to zero.
>
> I don't strongly object to changing it.  I think we should be careful
> about relying on zeroing for correctness though.  The bits are in principle
> undefined and we can't rely on reading zeros from equivalent memory or
> register values.
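To make the "continue the existing pattern" point concrete, here is a
small stand-alone model of a stepped encoding, using plain integers
instead of the real VECTOR_CST/tree representation (pattern_model and
model_vector_elt are invented names and the layout is simplified):
each pattern has the form { BASE0, BASE1, BASE1 + STEP, ... }, and an
index past the encoded elements just keeps stepping, so an
"out-of-range" element reads as a continuation of the series rather
than as zero.

/* Simplified stand-alone model of a stepped vector-constant encoding;
   plain ints instead of trees, names invented for illustration.  */

#include <cstdio>
#include <vector>

struct pattern_model
{
  long base0, base1, step;

  /* Value of the Q-th element of this pattern, for any Q, including
     positions past what would actually be encoded.  */
  long elt (unsigned q) const
  {
    return q == 0 ? base0 : base1 + (long) (q - 1) * step;
  }
};

/* Element I of a vector made of NPATTERNS interleaved patterns.  */
static long
model_vector_elt (const std::vector<pattern_model> &patterns, unsigned i)
{
  unsigned npatterns = (unsigned) patterns.size ();
  return patterns[i % npatterns].elt (i / npatterns);
}

int
main ()
{
  /* A single { 0, 1, 2, 3, ... } pattern, as for a VLA index vector.  */
  std::vector<pattern_model> iota = { { 0, 1, 1 } };
  /* Indices past a fixed 4-element view keep stepping: 0 1 2 3 4 5 6 7.  */
  for (unsigned i = 0; i < 8; ++i)
    printf ("%ld ", model_vector_elt (iota, i));
  printf ("\n");
}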
The main motivation for a change here is to allow catching out-of-bound
indices again for VECTOR_CST_ELT, at least for constant nunits, because
an out-of-bound access might be a programming error like fat-fingering
the index.  I do think it's a regression that we no longer catch those.
It's probably also a bit non-obvious how an encoding continues, and
there might be DImode masks that can be represented by a zero-extended
QImode immediate but that, when "continued", would require a larger
immediate.  The change also effectively only changes something for
single-byte encodings, since nunits is a power of two and so is the
element size in bits.

A patch restoring the VECTOR_CST_ELT checking might be the following

diff --git a/gcc/tree.cc b/gcc/tree.cc
index 046a558d1b0..4c9b05167fd 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -10325,6 +10325,9 @@ vector_cst_elt (const_tree t, unsigned int i)
   if (i < encoded_nelts)
     return VECTOR_CST_ENCODED_ELT (t, i);
 
+  /* Catch out-of-bound element accesses.  */
+  gcc_checking_assert (maybe_gt (VECTOR_CST_NELTS (t), i));
+
   /* If there are no steps, the final encoded value is the right one.  */
   if (!VECTOR_CST_STEPPED_P (t))
     {

but it triggers quite a bit via const_binop, for example

#2  0x00000000011c1506 in const_binop (code=PLUS_EXPR,
    arg1=<vector_cst 0x7ffff6f40b40>, arg2=<vector_cst 0x7ffff6e731e0>)

(gdb) p debug_generic_expr (arg1)
{ 12, 13, 14, 15 }
$5 = void
(gdb) p debug_generic_expr (arg2)
{ -2, -2, -2, -3 }
(gdb) p count
$4 = 6
(gdb) l
1711	      if (!elts.new_binary_operation (type, arg1, arg2, step_ok_p))
1712	        return NULL_TREE;
1713	      unsigned int count = elts.encoded_nelts ();
1714	      for (unsigned int i = 0; i < count; ++i)
1715	        {
1716	          tree elem1 = VECTOR_CST_ELT (arg1, i);
1717	          tree elem2 = VECTOR_CST_ELT (arg2, i);
1718	
1719	          tree elt = const_binop (code, elem1, elem2);

this seems like an error to me - why would we, for fixed-size vectors
and for PLUS, ever create a vector encoding with 6 elements?!  That
seems at least inefficient to me?

Richard.

> Thanks,
> Richard
>
>
> > As a side-effect this now consistently results in zeros in the
> > padding of the last encoded byte which also avoids the failure
> > mode seen in PR113576.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > OK?
> >
> > Thanks,
> > Richard.
> >
> > 	PR middle-end/113576
> > 	* fold-const.cc (native_encode_vector_part): Avoid accessing
> > 	out-of-bound elements.
> > ---
> >  gcc/fold-const.cc | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > index 80e211e18c0..8638757312b 100644
> > --- a/gcc/fold-const.cc
> > +++ b/gcc/fold-const.cc
> > @@ -8057,13 +8057,13 @@ native_encode_vector_part (const_tree expr, unsigned char *ptr, int len,
> >  	off = 0;
> >  
> >        /* Zero the buffer and then set bits later where necessary.  */
> > -      int extract_bytes = MIN (len, total_bytes - off);
> > +      unsigned elts_per_byte = BITS_PER_UNIT / elt_bits;
> > +      unsigned first_elt = off * elts_per_byte;
> > +      unsigned extract_elts = MIN (len * elts_per_byte, count - first_elt);
> > +      unsigned extract_bytes = CEIL (elt_bits * extract_elts, BITS_PER_UNIT);
> >        if (ptr)
> >  	memset (ptr, 0, extract_bytes);
> >  
> > -      unsigned int elts_per_byte = BITS_PER_UNIT / elt_bits;
> > -      unsigned int first_elt = off * elts_per_byte;
> > -      unsigned int extract_elts = extract_bytes * elts_per_byte;
> >        for (unsigned int i = 0; i < extract_elts; ++i)
> >  	{
> >  	  tree elt = VECTOR_CST_ELT (expr, first_elt + i);
>

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
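For reference, the bounds arithmetic of the hunk quoted above can be
modelled with plain integers.  The numbers below (a 4-element mask with
1-bit elements and plenty of buffer space) are made up for illustration
and this is not the GCC function itself; it only contrasts the old
byte-based bound with the new element-based one.

/* Plain-integer model of the extraction bounds in the quoted
   native_encode_vector_part hunk.  The inputs are invented; the local
   BITS_PER_UNIT constant just mirrors the macro of the same name.  */

#include <algorithm>
#include <cstdio>

int
main ()
{
  const unsigned BITS_PER_UNIT = 8;
  unsigned count = 4;      /* number of vector elements       */
  unsigned elt_bits = 1;   /* sub-byte boolean elements       */
  unsigned len = 16;       /* buffer bytes available          */
  unsigned off = 0;
  unsigned total_bytes
    = (count * elt_bits + BITS_PER_UNIT - 1) / BITS_PER_UNIT;

  /* Old: base everything on the rounded-up byte size.  */
  unsigned old_extract_bytes = std::min (len, total_bytes - off);
  unsigned elts_per_byte = BITS_PER_UNIT / elt_bits;
  unsigned first_elt = off * elts_per_byte;
  unsigned old_extract_elts = old_extract_bytes * elts_per_byte;

  /* New: base everything on the number of elements that exist.  */
  unsigned new_extract_elts
    = std::min (len * elts_per_byte, count - first_elt);
  unsigned new_extract_bytes
    = (elt_bits * new_extract_elts + BITS_PER_UNIT - 1) / BITS_PER_UNIT;

  printf ("old: %u elts (indices %u..%u, but only %u exist), %u byte(s)\n",
	  old_extract_elts, first_elt, first_elt + old_extract_elts - 1,
	  count, old_extract_bytes);
  printf ("new: %u elts, %u byte(s)\n", new_extract_elts, new_extract_bytes);
  return 0;
}

With the element-based bound the loop never asks VECTOR_CST_ELT for
elements 4..7, and the padding bits of the last byte are left at the
zero written by the memset, which is the side-effect mentioned in the
quoted patch description.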