On Fri, 24 Feb 2023, Richard Biener wrote: > On Thu, 23 Feb 2023, Jakub Jelinek wrote: > > > On Thu, Feb 23, 2023 at 03:02:01PM +0000, Richard Biener wrote: > > > > > * vec.h (auto_vec<T, N>): Turn m_data storage into > > > > > uninitialized unsigned char. > > > > > > > > Given that we actually never reference the m_data array anywhere, > > > > it is just to reserve space, I think even the alignas(T) there is > > > > useless. The point is that m_auto has as data members: > > > > vec_prefix m_vecpfx; > > > > T m_vecdata[1]; > > > > and we rely on it (admittedly -fstrict-flex-arrays{,=2,=3} or > > > > -fsanitize=bound-sstrict incompatible) being treated as > > > > flexible array member flowing into the m_data storage after it. > > > > > > Doesn't the array otherwise eventually overlap with tail padding > > > in m_auto? Or does an array of T never produce tail padding? > > > > The array can certainly overlap with tail padding in m_auto if any. > > But whether m_data is aligned to alignof (T) or not doesn't change anything > > on it. > > m_vecpfx is struct { unsigned m_alloc : 31, m_using_auto_storage : 1, > > m_num; }, > > so I think there is on most arches tail padding if T has smaller alignment > > than int, so typically char/short or structs with the same size/alignments. > > If that happens, alignof (auto_vec_x.m_auto) will be alignof (int), > > there can be 2 or 3 padding bytes, but because sizeof (auto_vec_x.m_auto) > > is 3 * sizeof (int), m_data will have offset always aligned to alignof (T). > > If alignof (T) >= alignof (int), then there won't be any tail padding > > at the end of m_auto, there could be padding between m_vecpfx and > > m_vecdata, sizeof (auto_vec_x.m_auto) will be a multiple of sizeof (T) and > > so m_data will be again already properly aligned. > > > > So, I think your patch is fine without alignas(T), the rest is just that > > there is more work to do incrementally, even for the case you want to > > deal with (the point 1) in particular). > > Looking at vec<T, A, vl_embed>::operator[] which just does > > template<typename T, typename A> > inline const T & > vec<T, A, vl_embed>::operator[] (unsigned ix) const > { > gcc_checking_assert (ix < m_vecpfx.m_num); > return m_vecdata[ix]; > } > > the whole thing looks fragile at best - we basically have > > struct auto_vec > { > struct vec<vl_embed> > { > ... > T m_vecdata[1]; > } m_auto; > T m_data[N-1]; > }; > > and access m_auto.m_vecdata[] as if it extends to m_data. That's > not something supported by the middle-end - not by design at least. > auto_vec *p; p->m_auto.m_vecdata[i] would never alias > p->m_data[j], in practice we might not see this though. Also > get_ref_base_and_extent will compute a maxsize/size of sizeof(T) > for any m_auto.m_vecdata[i] access, but I think we nowhere > actually replace 'i' by zero based on this knowledge, but we'd > perform CSE with earlier m_auto.m_vecdata[0] stores, so that > might be something one could provoke. Doing a self-test like > > static __attribute__((noipa)) void > test_auto_alias (int i) > { > auto_vec<int, 8> v; > v.quick_grow (2); > v[0] = 1; > v[1] = 2; > int val = v[i]; > ASSERT_EQ (val, 2); > } > > shows > > _27 = &_25->m_vecdata[0]; > *_27 = 1; > ... > _7 = &_12->m_vecdata[i.235_3]; > val_13 = *_7; > > which is safe in middle-end rules though. So what "saves" us > here is that we always return a reference and never a value. > There's the ::iterate member function which fails to do this, > the ::quick_push function does > > T *slot = &m_vecdata[m_vecpfx.m_num++]; > *slot = obj; > > with > > static __attribute__((noipa)) void > test_auto_alias (int i) > { > auto_vec<int, 8> v; > v.quick_grow (2); > v[0] = 1; > v[1] = 2; > int val; > for (int ix = i; v.iterate (ix, &val); ix++) > ; > ASSERT_EQ (val, 2); > } > > I get that optimzied to a FAIL. I have a "fix" for this. > unordered_remove has a similar issue accesing the last element.
Turns out forwprop "breaks" this still, so the fix doesn't work. That means we have a hole here in the middle-end. We can avoid this by obfuscating things even more, like to const T *first = m_vecdata; const T *slot = first + ix; *ptr = *slot; which at least for variable 'ix' avoids forwprop from triggering. But this also means that the existing [] accessor isn't really safe, we're just lucky that we turn constant accesses to MEM[(int &)&v + off] = val; and that we now have PR108355 which made the get_ref_base_and_extent info used less often in VN. I'm testing the patch now without the new selftest, it should be good to avoid these issues for constant indexes. I can also split the patch up. But in the end I think we have to fix auto_vec in a better way. Richard. > There are a few functions using the [] access member which is > at least sub-optimal due to repeated bounds checking but also safe. > > I suppose if auto_vec would be a union of vec<vl_embed> and > a storage member with the vl_embed active that would work, but then > that's likely not something C++11 supports. > > So I think to support auto_vec we'd need to make the m_vecdata[] > member in vec<vl_embed> of templated size (defaulted to 1) > and get rid of the m_data member in auto_vec instead. Or have > another C++ way of increasing the size of auto_vec without > actually adding any member? > > The vec<vl_embed> data accesses then would need to go through > a wrapper obtaining a correctly typed pointer to m_vecdata[] > since we'd like to have that as unsigned char[] to avoid the > initialization. > > > > Yes, I'm not proposing to fix non-POD support. I want to make > > > as-if-POD stuff like std::pair to work like it was intended. > > > > > > > Oh, and perhaps we should start marking such spots in GCC with > > > > strict_flex_array attribute to make it clear where we rely on the > > > > non-strict behavior. > > > > > > I think we never access the array directly as array, do we? > > > > Sure, the attribute should go to m_vecdata array, not to m_data. > > And to op array in gimple_statement_with_ops, operands array in > > operands, ops array in tree_omp_clause, val in tree_int_cst, > > str in tree_string, elts in tree_vector, a in tree_vec, elem in rtvec_def, > > elem in hwivec_def, u.{fld,hwint} in rtx_def and various others, we > > use this stuff everywhere. Also often use GTY length similarly to the > > proposed element_count... > > Here's a patch to fix vec.h to adhere to GCC middle-ends interpretation > of array accesses - &array[i] is just address arithmetic and out-of-bounds > accesses are OK. But that doesn't prevent other host compilers from > miscompiling stage1 I guess, I'm not sure if there's any standard > conforming way to compute the address of an element in m_data from > the address of m_vecdata? > > Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK? > > Thanks, > Richard. > > > From b7d47dd1c166d216eba7160f11087312f8b5bbef Mon Sep 17 00:00:00 2001 > From: Richard Biener <rguent...@suse.de> > Date: Fri, 24 Feb 2023 09:54:09 +0100 > Subject: [PATCH] Fix vec.h alias problems > To: gcc-patches@gcc.gnu.org > > With auto_vec we have embedded storage that spans two members, > the m_auto.m_vecdata[1] and the m_data[N-1] arrays and accesses > to the data are all based on m_auto.m_vecdata[]. That's OK for > GCC as long as the accesses are done indirectly through a pointer > since the address computation based on m_auto.m_vecdata[] is > considered OK by GCC (but not by the language standards). > > The following fixes the few places that failed to enfoce this > indirection and adds one self-test that failed before. > > As I was here it also fixes ::contains to avoid repeated bounds > checking and the same issue in :;lower_bound which also suffers > from unnecessary copying around values. > > * vec.h (vec<T, A, vl_embed>::lower_bound): Adjust to > take a const reference to the object, access m_vecdata > directly. > (vec<T, A, vl_embed>::contains): Likewise. > (vec<T, A, vl_embed>::iterate): Perform accesses to > m_vecdata indirectly. > (vec<T, A, vl_embed>::unordered_remove): Likewise. > * vec.cc (test_auto_alias): New. > (vec_cc_tests): Call it. > --- > gcc/vec.cc | 17 +++++++++++++++++ > gcc/vec.h | 21 ++++++++++++++------- > 2 files changed, 31 insertions(+), 7 deletions(-) > > diff --git a/gcc/vec.cc b/gcc/vec.cc > index 511e6dff50d..afba66cb3d0 100644 > --- a/gcc/vec.cc > +++ b/gcc/vec.cc > @@ -568,6 +568,22 @@ test_auto_delete_vec () > ASSERT_EQ (dtor_count, 2); > } > > +/* Verify accesses to m_vecdata are done indirectly. */ > + > +static void > +test_auto_alias () > +{ > + volatile int i = 1; > + auto_vec<int, 8> v; > + v.quick_grow (2); > + v[0] = 1; > + v[1] = 2; > + int val; > + for (int ix = i; v.iterate (ix, &val); ix++) > + ; > + ASSERT_EQ (val, 2); > +} > + > /* Run all of the selftests within this file. */ > > void > @@ -587,6 +603,7 @@ vec_cc_tests () > test_qsort (); > test_reverse (); > test_auto_delete_vec (); > + test_auto_alias (); > } > > } // namespace selftest > diff --git a/gcc/vec.h b/gcc/vec.h > index cee96254a31..b3ec977c23a 100644 > --- a/gcc/vec.h > +++ b/gcc/vec.h > @@ -614,7 +614,7 @@ public: > T *bsearch (const void *key, int (*compar)(const void *, const void *)); > T *bsearch (const void *key, > int (*compar)(const void *, const void *, void *), void *); > - unsigned lower_bound (T, bool (*)(const T &, const T &)) const; > + unsigned lower_bound (const T &, bool (*)(const T &, const T &)) const; > bool contains (const T &search) const; > static size_t embedded_size (unsigned); > void embedded_init (unsigned, unsigned = 0, unsigned = 0); > @@ -929,7 +929,8 @@ vec<T, A, vl_embed>::iterate (unsigned ix, T *ptr) const > { > if (ix < m_vecpfx.m_num) > { > - *ptr = m_vecdata[ix]; > + const T *slot = &m_vecdata[ix]; > + *ptr = *slot; > return true; > } > else > @@ -1118,7 +1119,9 @@ inline void > vec<T, A, vl_embed>::unordered_remove (unsigned ix) > { > gcc_checking_assert (ix < length ()); > - m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num]; > + const T *last_slot = &m_vecdata[--m_vecpfx.m_num]; > + T *slot = &m_vecdata[ix]; > + *slot = *last_slot; > } > > > @@ -1249,8 +1252,11 @@ vec<T, A, vl_embed>::contains (const T &search) const > { > unsigned int len = length (); > for (unsigned int i = 0; i < len; i++) > - if ((*this)[i] == search) > - return true; > + { > + const T *slot = &m_vecdata[i]; > + if (*slot == search) > + return true; > + } > > return false; > } > @@ -1262,7 +1268,8 @@ vec<T, A, vl_embed>::contains (const T &search) const > > template<typename T, typename A> > unsigned > -vec<T, A, vl_embed>::lower_bound (T obj, bool (*lessthan)(const T &, const T > &)) > +vec<T, A, vl_embed>::lower_bound (const T &obj, > + bool (*lessthan)(const T &, const T &)) > const > { > unsigned int len = length (); > @@ -1273,7 +1280,7 @@ vec<T, A, vl_embed>::lower_bound (T obj, bool > (*lessthan)(const T &, const T &)) > half = len / 2; > middle = first; > middle += half; > - T middle_elem = (*this)[middle]; > + const T &middle_elem = this->m_vecdata[middle]; > if (lessthan (middle_elem, obj)) > { > first = middle; > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg)