On 11/10/2024 14:30, Richard Biener wrote:
> On Fri, 11 Oct 2024, Richard Sandiford wrote:
> 
> > Alex Coplan <alex.cop...@arm.com> writes:
> > > Hi,
> > >
> > > As the PR shows, pair-fusion was tricking memory_modified_in_insn_p into
> > > returning false when a common base register (in this case, x1) was
> > > modified between the mem and the store insn.  This lead to wrong code as
> > > the accesses really did alias.
> > >
> > > To avoid this sort of problem, this patch avoids invoking RTL alias
> > > analysis altogether (and assume an alias conflict) if the two insns to
> > > be compared share a common address register R, and the insns see different
> > > definitions of R (i.e. it was modified in between).
> > >
> > > Bootstrapped/regtested on aarch64-linux-gnu (all languages, both regular
> > > bootstrap and LTO+PGO bootstrap).  OK for trunk?
> > 
> > Sorry for the slow review.  The patch looks good to me, but...

Thanks for the review.  I'd missed that you'd sent this, sorry for not
responding sooner.

> > 
> > > @@ -2544,11 +2624,37 @@ pair_fusion_bb_info::try_fuse_pair (bool load_p, 
> > > unsigned access_size,
> > >      && bitmap_bit_p (&m_tombstone_bitmap, insn->uid ());
> > >    };
> > >  
> > > +  // Maximum number of distinct regnos we expect to appear in a single
> > > +  // MEM (and thus in a candidate insn).
> > > +  static constexpr int max_mem_regs = 2;
> > > +  auto_vec<access_info *, max_mem_regs> addr_use_vec[2];
> > > +  use_array addr_uses[2];
> > > +
> > > +  // Collect the lists of register uses that occur in the candidate MEMs.
> > > +  for (int i = 0; i < 2; i++)
> > > +    {
> > > +      // N.B. it's safe for us to ignore uses that only occur in notes
> > > +      // here (e.g. in a REG_EQUIV expression) since we only pass the
> > > +      // MEM down to the alias machinery, so it can't see any insn-level
> > > +      // notes.
> > > +      for (auto use : insns[i]->uses ())
> > > + if (use->is_reg ()
> > > +     && use->includes_address_uses ()
> > > +     && !use->only_occurs_in_notes ())
> > > +   {
> > > +     gcc_checking_assert (addr_use_vec[i].length () < max_mem_regs);
> > > +     addr_use_vec[i].quick_push (use);
> > 
> > ...if possible, I think it would be better to just use safe_push here,
> > without the assert.  There'd then be no need to split max_mem_regs out;
> > it could just be hard-coded in the addr_use_vec declaration.

I hadn't realised at the time that quick_push () already does a
gcc_checking_assert to make sure that we don't overflow.  It does:

  template<typename T, typename A>
  inline T *
  vec<T, A, vl_embed>::quick_push (const T &obj)
  {
    gcc_checking_assert (space (1));
    T *slot = &address ()[m_vecpfx.m_num++];
    ::new (static_cast<void*>(slot)) T (obj);
    return slot;
  }

(I checked the behaviour by writing a quick selftest in vec.cc, and it
indeed aborts as expected with quick_push on overflow for a
stack-allocated auto_vec with N = 2.)

This means that the assert above is indeed redundant, so I agree that
we should be able to drop the assert and drop the max_mem_regs constant,
using a literal inside the auto_vec template instead (all while still
using quick_push).

Does that sound OK to you, or did you have another reason to prefer
safe_push?  AIUI the behaviour of safe_push on overflow would be to
allocate a new (heap-allocated) vector instead of asserting.

> > 
> > Or does that not work for some reason?  I'm getting a sense of deja vu...
> 
> safe_push should work but as I understand the desire is to rely
> on fully on-stack pre-allocated vectors?

Yes, that was indeed the original intent.

Thanks,
Alex

> 
> > If it doesn't work, an alternative would be to use access_array_builder.
> > 
> > OK for trunk and backports if using safe_push works.
> > 
> > Thanks,
> > Richard
> > 
> > > +   }
> > > +      addr_uses[i] = use_array (addr_use_vec[i]);
> > > +    }
> > > +
> > 
> > 
> > >    store_walker<false, decltype(tombstone_p)>
> > > -    forward_store_walker (mem_defs[0], cand_mems[0], insns[1], 
> > > tombstone_p);
> > > +    forward_store_walker (mem_defs[0], cand_mems[0], addr_uses[0], 
> > > insns[1],
> > > +                   tombstone_p);
> > >  
> > >    store_walker<true, decltype(tombstone_p)>
> > > -    backward_store_walker (mem_defs[1], cand_mems[1], insns[0], 
> > > tombstone_p);
> > > +    backward_store_walker (mem_defs[1], cand_mems[1], addr_uses[1], 
> > > insns[0],
> > > +                    tombstone_p);
> > >  
> > >    alias_walker *walkers[4] = {};
> > >    if (mem_defs[0])
> > > @@ -2562,8 +2668,10 @@ pair_fusion_bb_info::try_fuse_pair (bool load_p, 
> > > unsigned access_size,
> > >      {
> > >        // We want to find any loads hanging off the first store.
> > >        mem_defs[0] = memory_access (insns[0]->defs ());
> > > -      load_walker<false> forward_load_walker (mem_defs[0], insns[0], 
> > > insns[1]);
> > > -      load_walker<true> backward_load_walker (mem_defs[1], insns[1], 
> > > insns[0]);
> > > +      load_walker<false> forward_load_walker (mem_defs[0], insns[0],
> > > +                                       addr_uses[0], insns[1]);
> > > +      load_walker<true> backward_load_walker (mem_defs[1], insns[1],
> > > +                                       addr_uses[1], insns[0]);
> > >        walkers[2] = &forward_load_walker;
> > >        walkers[3] = &backward_load_walker;
> > >        m_pass->do_alias_analysis (alias_hazards, walkers, load_p);
> > > diff --git a/gcc/testsuite/g++.dg/torture/pr116783.C 
> > > b/gcc/testsuite/g++.dg/torture/pr116783.C
> > > new file mode 100644
> > > index 00000000000..6d59159459d
> > > --- /dev/null
> > > +++ b/gcc/testsuite/g++.dg/torture/pr116783.C
> > > @@ -0,0 +1,98 @@
> > > +// { dg-do run }
> > > +// { dg-additional-options "-fstack-protector-strong 
> > > -fno-late-combine-instructions" }
> > > +// { dg-require-effective-target fstack_protector }
> > > +// { dg-require-effective-target c++11 }
> > > +
> > > +struct Private {
> > > +  char data[24]{};
> > > +  long moved_from : 4;
> > > +  Private() : moved_from (0) {}
> > > +};
> > > +
> > > +struct QVariant {
> > > +  __attribute__((noipa))
> > > +  ~QVariant() {
> > > +    if (!d.moved_from && d.data[0] != 42)
> > > +      __builtin_abort ();
> > > +  }
> > > +  __attribute__((noipa))
> > > +  QVariant() {
> > > +    d.data[0] = 42;
> > > +  }
> > > +  __attribute__((noipa))
> > > +  QVariant(QVariant &other) : d(other.d) {}
> > > +  QVariant(QVariant &&other) : d(other.d) {
> > > +    other.d = Private();
> > > +    other.d.moved_from = true;
> > > +  }
> > > +  QVariant &operator=(QVariant);
> > > +  Private d;
> > > +};
> > > +
> > > +QVariant id (QVariant v) { return v; }
> > > +QVariant &QVariant::operator=(QVariant other)
> > > +{
> > > +  id(other);
> > > +  return *this;
> > > +}
> > > +
> > > +template <typename T> struct QList {
> > > +  T d;
> > > +  struct const_iterator {
> > > +    T *ptr;
> > > +    T &operator*() { return *ptr; }
> > > +    __attribute__((noipa))
> > > +    bool operator!=(const_iterator other) { return ptr != other.ptr; }
> > > +    void operator++() { ptr++; }
> > > +  };
> > > +  __attribute__((noipa))
> > > +  T at() { return d; }
> > > +  const_iterator begin() { return const_iterator { &d }; }
> > > +  const_iterator end() { return const_iterator { &d }; }
> > > +};
> > > +struct QArrayDataPointer {
> > > +  int d;
> > > +  int *ptr;
> > > +  long size;
> > > +};
> > > +
> > > +QArrayDataPointer null_qadp;
> > > +
> > > +struct QString {
> > > +  __attribute__((noipa))
> > > +  QList<QString> split() const {
> > > +    return QList<QString> {null_qadp};
> > > +  }
> > > +  __attribute__((noipa))
> > > +  friend bool operator==(QString, QString) { return true; }
> > > +
> > > +  __attribute__((noipa))
> > > +  QString(QArrayDataPointer dp) : d(dp) {}
> > > +  QArrayDataPointer d;
> > > +};
> > > +
> > > +__attribute__((noipa))
> > > +QString as_qstr (QVariant *v)
> > > +{
> > > +  return QString (null_qadp);
> > > +}
> > > +
> > > +int *getNode(const QString &tagContent) {
> > > +  auto expr = tagContent.split();
> > > +  auto blockName = expr.at();
> > > +  auto loadedBlocksVariant = QVariant ();
> > > +  QList<QVariant> blockVariantList;
> > > +  for (auto &item : blockVariantList) {
> > > +    auto blockNodeName = as_qstr (&item);
> > > +    blockNodeName == blockName;
> > > +    QString q(null_qadp);
> > > +  }
> > > +  loadedBlocksVariant = QVariant();
> > > +  return nullptr;
> > > +}
> > > +
> > > +int main(void)
> > > +{
> > > +  QString foo(null_qadp);
> > > +  getNode(foo);
> > > +}
> > 
> 
> -- 
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to