https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95405

Gabriel Ravier <gabravier at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gabravier at gmail dot com

--- Comment #2 from Gabriel Ravier <gabravier at gmail dot com> ---
Welp, I've tried to reduce this to a simplified test case, but I can't seem to
reproduce the same assembly output, no matter how close I get the GIMPLE output.

With this code:

struct opbeb {};

union opbs {
        opbeb empty_byte;
        long value;
};

struct opb {
        opbs payload;
        bool engaged;
};

struct op : public opb {
};

struct ob {
        op payload;
};

struct o {
        ob base;
};

o foo();

long bar()
{
        struct o r = foo();
        if (__builtin_expect_with_probability(
                    (*(const ob *)&r).payload.engaged != 0, 1, .66))
                return (long &)*(long *)&r;
        else
                return 0;
}

I get this final GIMPLE (i.e. -fdump-tree-optimized):

;; Function bar (_Z3barv, funcdef_no=9255, decl_uid=109154, cgraph_uid=6606, symbol_order=6814)

Removing basic block 5
long int bar ()
{
  struct o r;
  bool _1;
  long int _4;
  long int _7;

  <bb 2> [local count: 1073741824]:
  r = foo ();
  _1 = MEM[(const struct ob *)&r].payload.D.109140.engaged;
  if (_1 != 0)
    goto <bb 3>; [66.00%]
  else
    goto <bb 4>; [34.00%]

  <bb 3> [local count: 708669601]:
  _7 = MEM[(long int &)&r];

  <bb 4> [local count: 1073741824]:
  # _4 = PHI <_7(3), 0(2)>
  r ={v} {CLOBBER};
  return _4;

}

This seems to be almost identical to the one I get from the real
std::optional:

;; Function bar (_Z3barv, funcdef_no=6084, decl_uid=49565, cgraph_uid=5869, symbol_order=5916)

Removing basic block 5
long int bar ()
{
  struct optional r;
  long int _1;
  bool _4;
  long int _5;

  <bb 2> [local count: 1073741824]:
  r = foo ();
  _4 = MEM[(const struct _Optional_base *)&r]._M_payload.D.50442._M_engaged;
  if (_4 != 0)
    goto <bb 3>; [66.00%]
  else
    goto <bb 4>; [34.00%]

  <bb 3> [local count: 708669601]:
  _5 = MEM[(long int &)&r];

  <bb 4> [local count: 1073741824]:
  # _1 = PHI <_5(3), 0(2)>
  r ={v} {CLOBBER};
  return _1;

}

Literally the only differences I can see are that the variables are declared in
a different order, and that some variable names are different.

Yet the assembly output for my version optimizes the store to memory away just
fine, and the std::optional output still fails to optimize the store to memory.
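
For comparison, the std::optional variant referred to above is not quoted in
this comment. A minimal sketch of what it presumably looks like follows; the
exact source is an assumption, and the body of foo() is purely hypothetical,
added only so the snippet builds and runs on its own (needs -std=c++17 or
later). To look at the store in question, foo() would have to be left as a
bare declaration, as in the hand-written version, so it can't be inlined.

```cpp
#include <optional>

// Hypothetical definition, only here so the sketch links and runs.
// In the test case above foo() is merely declared, so the compiler
// cannot see through the call.
std::optional<long> foo()
{
        return 42L;
}

// Assumed shape of the std::optional variant: same control flow as the
// hand-written bar(), with the library type in place of struct o.
long bar()
{
        std::optional<long> r = foo();
        if (r)
                return *r;
        else
                return 0;
}
```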

Is this (very minor) difference really that significant, or is there something
I can't see in the dumped GIMPLE that causes the difference? I tried to
delve into the RTL, but I failed to really understand what was going on
(though I could see significant differences there between what I wrote and the
original example).
I've also checked the assembly and, as far as I can see, there is no functional
difference between what I wrote and the original one. LLVM even produces the
exact same assembly for both.

I've also tried to rule out the difference in variable declaration placement
and naming by rewriting what I wrote into GIMPLE and modifying it to match
the original example as closely as possible, with this being my best
effort:

long int __GIMPLE (ssa,guessed_local(1073741824))
bar ()
{
  struct o r;
  long int _1;
  bool _4;
  long int _7;

  __BB(2,guessed_local(1073741824)):
  r = foo ();
  _4 = __MEM <const struct ob> ((const struct ob *)&r).payload.base.engaged;
  if (_4 != _Literal (bool) 0)
    goto __BB3(guessed(88583700));
  else
    goto __BB4(guessed(45634028));

  __BB(3,guessed_local(708669601)):
  _7 = __MEM <long int> (&r);
  goto __BB4(precise(134217728));

  __BB(4,guessed_local(1073741824)):
  _1 = __PHI (__BB3: _7, __BB2: 0l);
  r = _Literal (struct o) {};
  return _1;

}

But it still gets optimized well, as expected, unlike the original, which is
rather mind-boggling to me, unless there really is a bunch of GIMPLE
information that isn't part of the dumped form.

PS: LLVM optimizes the original example and what I wrote perfectly fine to the
same assembly code.
