Re: [PATCH] Avoid default-initializing auto_vec storage

Richard Biener via Gcc-patches Fri, 24 Feb 2023 01:35:00 -0800

On Fri, 24 Feb 2023, Richard Biener wrote:

> On Thu, 23 Feb 2023, Jakub Jelinek wrote:
> 
> > On Thu, Feb 23, 2023 at 03:02:01PM +0000, Richard Biener wrote:
> > > > >       * vec.h (auto_vec<T, N>): Turn m_data storage into
> > > > >       uninitialized unsigned char.
> > > > 
> > > > Given that we actually never reference the m_data array anywhere,
> > > > it is just to reserve space, I think even the alignas(T) there is
> > > > useless.  The point is that m_auto has as data members:
> > > >   vec_prefix m_vecpfx;
> > > >   T m_vecdata[1];
> > > > and we rely on it (admittedly -fstrict-flex-arrays{,=2,=3} or
> > > > -fsanitize=bound-sstrict incompatible) being treated as
> > > > flexible array member flowing into the m_data storage after it.
> > > 
> > > Doesn't the array otherwise eventually overlap with tail padding
> > > in m_auto?  Or does an array of T never produce tail padding?
> > 
> > The array can certainly overlap with tail padding in m_auto if any.
> > But whether m_data is aligned to alignof (T) or not doesn't change anything
> > on it.
> > m_vecpfx is struct { unsigned m_alloc : 31, m_using_auto_storage : 1, 
> > m_num; },
> > so I think there is on most arches tail padding if T has smaller alignment
> > than int, so typically char/short or structs with the same size/alignments.
> > If that happens, alignof (auto_vec_x.m_auto) will be alignof (int),
> > there can be 2 or 3 padding bytes, but because sizeof (auto_vec_x.m_auto)
> > is 3 * sizeof (int), m_data will have offset always aligned to alignof (T).
> > If alignof (T) >= alignof (int), then there won't be any tail padding
> > at the end of m_auto, there could be padding between m_vecpfx and
> > m_vecdata, sizeof (auto_vec_x.m_auto) will be a multiple of sizeof (T) and
> > so m_data will be again already properly aligned.
> > 
> > So, I think your patch is fine without alignas(T), the rest is just that
> > there is more work to do incrementally, even for the case you want to
> > deal with (the point 1) in particular).
> 
> Looking at vec<T, A, vl_embed>::operator[] which just does
> 
> template<typename T, typename A>
> inline const T &
> vec<T, A, vl_embed>::operator[] (unsigned ix) const
> {
>   gcc_checking_assert (ix < m_vecpfx.m_num);
>   return m_vecdata[ix];
> } 
> 
> the whole thing looks fragile at best - we basically have
> 
> struct auto_vec
> {
>   struct vec<vl_embed>
>   {
> ...
>     T m_vecdata[1];
>   } m_auto;
>   T m_data[N-1];
> };
> 
> and access m_auto.m_vecdata[] as if it extends to m_data.  That's
> not something supported by the middle-end - not by design at least.
> auto_vec *p; p->m_auto.m_vecdata[i] would never alias
> p->m_data[j], in practice we might not see this though.  Also
> get_ref_base_and_extent will compute a maxsize/size of sizeof(T)
> for any m_auto.m_vecdata[i] access, but I think we nowhere
> actually replace 'i' by zero based on this knowledge, but we'd
> perform CSE with earlier m_auto.m_vecdata[0] stores, so that
> might be something one could provoke.  Doing a self-test like
> 
> static __attribute__((noipa)) void
> test_auto_alias (int i)
> { 
>   auto_vec<int, 8> v;
>   v.quick_grow (2);
>   v[0] = 1;
>   v[1] = 2;
>   int val = v[i];
>   ASSERT_EQ (val, 2);
> } 
> 
> shows
> 
>   _27 = &_25->m_vecdata[0];
>   *_27 = 1;
> ...
>   _7 = &_12->m_vecdata[i.235_3];
>   val_13 = *_7;
> 
> which is safe in middle-end rules though.  So what "saves" us
> here is that we always return a reference and never a value.
> There's the ::iterate member function which fails to do this,
> the ::quick_push function does
> 
>   T *slot = &m_vecdata[m_vecpfx.m_num++];
>   *slot = obj;
> 
> with
> 
> static __attribute__((noipa)) void
> test_auto_alias (int i)
> { 
>   auto_vec<int, 8> v;
>   v.quick_grow (2);
>   v[0] = 1;
>   v[1] = 2;
>   int val;
>   for (int ix = i; v.iterate (ix, &val); ix++)
>     ;
>   ASSERT_EQ (val, 2);
> } 
> 
> I get that optimzied to a FAIL.  I have a "fix" for this.
> unordered_remove has a similar issue accesing the last element.


Turns out forwprop "breaks" this still, so the fix doesn't work.
That means we have a hole here in the middle-end.  We can
avoid this by obfuscating things even more, like to

      const T *first = m_vecdata;
      const T *slot = first + ix;
      *ptr = *slot;

which at least for variable 'ix' avoids forwprop from triggering.

But this also means that the existing [] accessor isn't really safe,
we're just lucky that we turn constant accesses to
MEM[(int &)&v + off] = val; and that we now have PR108355
which made the get_ref_base_and_extent info used less often in VN.

I'm testing the patch now without the new selftest, it should be
good to avoid these issues for constant indexes.  I can also split
the patch up.

But in the end I think we have to fix auto_vec in a better way.

Richard.

> There are a few functions using the [] access member which is
> at least sub-optimal due to repeated bounds checking but also safe.
> 
> I suppose if auto_vec would be a union of vec<vl_embed> and
> a storage member with the vl_embed active that would work, but then
> that's likely not something C++11 supports.
> 
> So I think to support auto_vec we'd need to make the m_vecdata[]
> member in vec<vl_embed> of templated size (defaulted to 1)
> and get rid of the m_data member in auto_vec instead.  Or have
> another C++ way of increasing the size of auto_vec without
> actually adding any member?
> 
> The vec<vl_embed> data accesses then would need to go through
> a wrapper obtaining a correctly typed pointer to m_vecdata[]
> since we'd like to have that as unsigned char[] to avoid the
> initialization.
> 
> > > Yes, I'm not proposing to fix non-POD support.  I want to make
> > > as-if-POD stuff like std::pair to work like it was intended.
> > > 
> > > > Oh, and perhaps we should start marking such spots in GCC with
> > > > strict_flex_array attribute to make it clear where we rely on the
> > > > non-strict behavior.
> > > 
> > > I think we never access the array directly as array, do we?
> > 
> > Sure, the attribute should go to m_vecdata array, not to m_data.
> > And to op array in gimple_statement_with_ops, operands array in
> > operands, ops array in tree_omp_clause, val in tree_int_cst,
> > str in tree_string, elts in tree_vector, a in tree_vec, elem in rtvec_def,
> > elem in hwivec_def, u.{fld,hwint} in rtx_def and various others, we
> > use this stuff everywhere.  Also often use GTY length similarly to the
> > proposed element_count...
> 
> Here's a patch to fix vec.h to adhere to GCC middle-ends interpretation
> of array accesses - &array[i] is just address arithmetic and out-of-bounds
> accesses are OK.  But that doesn't prevent other host compilers from
> miscompiling stage1 I guess, I'm not sure if there's any standard
> conforming way to compute the address of an element in m_data from
> the address of m_vecdata?
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK?
> 
> Thanks,
> Richard.
> 
> 
> From b7d47dd1c166d216eba7160f11087312f8b5bbef Mon Sep 17 00:00:00 2001
> From: Richard Biener <rguent...@suse.de>
> Date: Fri, 24 Feb 2023 09:54:09 +0100
> Subject: [PATCH] Fix vec.h alias problems
> To: gcc-patches@gcc.gnu.org
> 
> With auto_vec we have embedded storage that spans two members,
> the m_auto.m_vecdata[1] and the m_data[N-1] arrays and accesses
> to the data are all based on m_auto.m_vecdata[].  That's OK for
> GCC as long as the accesses are done indirectly through a pointer
> since the address computation based on m_auto.m_vecdata[] is
> considered OK by GCC (but not by the language standards).
> 
> The following fixes the few places that failed to enfoce this
> indirection and adds one self-test that failed before.
> 
> As I was here it also fixes ::contains to avoid repeated bounds
> checking and the same issue in :;lower_bound which also suffers
> from unnecessary copying around values.
> 
>       * vec.h (vec<T, A, vl_embed>::lower_bound): Adjust to
>       take a const reference to the object, access m_vecdata
>       directly.
>       (vec<T, A, vl_embed>::contains): Likewise.
>       (vec<T, A, vl_embed>::iterate): Perform accesses to
>       m_vecdata indirectly.
>       (vec<T, A, vl_embed>::unordered_remove): Likewise.
>       * vec.cc (test_auto_alias): New.
>       (vec_cc_tests): Call it.
> ---
>  gcc/vec.cc | 17 +++++++++++++++++
>  gcc/vec.h  | 21 ++++++++++++++-------
>  2 files changed, 31 insertions(+), 7 deletions(-)
> 
> diff --git a/gcc/vec.cc b/gcc/vec.cc
> index 511e6dff50d..afba66cb3d0 100644
> --- a/gcc/vec.cc
> +++ b/gcc/vec.cc
> @@ -568,6 +568,22 @@ test_auto_delete_vec ()
>    ASSERT_EQ (dtor_count, 2);
>  }
>  
> +/* Verify accesses to m_vecdata are done indirectly.  */
> +
> +static void
> +test_auto_alias ()
> +{
> +  volatile int i = 1;
> +  auto_vec<int, 8> v;
> +  v.quick_grow (2);
> +  v[0] = 1;
> +  v[1] = 2;
> +  int val;
> +  for (int ix = i; v.iterate (ix, &val); ix++)
> +    ;
> +  ASSERT_EQ (val, 2);
> +}
> +
>  /* Run all of the selftests within this file.  */
>  
>  void
> @@ -587,6 +603,7 @@ vec_cc_tests ()
>    test_qsort ();
>    test_reverse ();
>    test_auto_delete_vec ();
> +  test_auto_alias ();
>  }
>  
>  } // namespace selftest
> diff --git a/gcc/vec.h b/gcc/vec.h
> index cee96254a31..b3ec977c23a 100644
> --- a/gcc/vec.h
> +++ b/gcc/vec.h
> @@ -614,7 +614,7 @@ public:
>    T *bsearch (const void *key, int (*compar)(const void *, const void *));
>    T *bsearch (const void *key,
>             int (*compar)(const void *, const void *, void *), void *);
> -  unsigned lower_bound (T, bool (*)(const T &, const T &)) const;
> +  unsigned lower_bound (const T &, bool (*)(const T &, const T &)) const;
>    bool contains (const T &search) const;
>    static size_t embedded_size (unsigned);
>    void embedded_init (unsigned, unsigned = 0, unsigned = 0);
> @@ -929,7 +929,8 @@ vec<T, A, vl_embed>::iterate (unsigned ix, T *ptr) const
>  {
>    if (ix < m_vecpfx.m_num)
>      {
> -      *ptr = m_vecdata[ix];
> +      const T *slot = &m_vecdata[ix];
> +      *ptr = *slot;
>        return true;
>      }
>    else
> @@ -1118,7 +1119,9 @@ inline void
>  vec<T, A, vl_embed>::unordered_remove (unsigned ix)
>  {
>    gcc_checking_assert (ix < length ());
> -  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
> +  const T *last_slot = &m_vecdata[--m_vecpfx.m_num];
> +  T *slot = &m_vecdata[ix];
> +  *slot = *last_slot;
>  }
>  
>  
> @@ -1249,8 +1252,11 @@ vec<T, A, vl_embed>::contains (const T &search) const
>  {
>    unsigned int len = length ();
>    for (unsigned int i = 0; i < len; i++)
> -    if ((*this)[i] == search)
> -      return true;
> +    {
> +      const T *slot = &m_vecdata[i];
> +      if (*slot == search)
> +     return true;
> +    }
>  
>    return false;
>  }
> @@ -1262,7 +1268,8 @@ vec<T, A, vl_embed>::contains (const T &search) const
>  
>  template<typename T, typename A>
>  unsigned
> -vec<T, A, vl_embed>::lower_bound (T obj, bool (*lessthan)(const T &, const T 
> &))
> +vec<T, A, vl_embed>::lower_bound (const T &obj,
> +                               bool (*lessthan)(const T &, const T &))
>    const
>  {
>    unsigned int len = length ();
> @@ -1273,7 +1280,7 @@ vec<T, A, vl_embed>::lower_bound (T obj, bool 
> (*lessthan)(const T &, const T &))
>        half = len / 2;
>        middle = first;
>        middle += half;
> -      T middle_elem = (*this)[middle];
> +      const T &middle_elem = this->m_vecdata[middle];
>        if (lessthan (middle_elem, obj))
>       {
>         first = middle;
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

Re: [PATCH] Avoid default-initializing auto_vec storage

Reply via email to