[committed][PR tree-optimization/90036] Backpropagate more equivalences in DOM

Jeffrey Law Sat, 28 Feb 2026 07:57:11 -0800

And now the rest of the fix for 90036.  Two changes of note.

First, when recording temporary equivalences from the edge info cache, if the equivalence has the form:


[0/1] = A EQ/NE CST;

Go ahead and backprop that equivalence to the uses of A.

Concretely from the BZ we have this:

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _1 = vptr_14(D) == 0;
  _2 = ownvptr_15(D) != 0;
  _3 = _1 | _2;
  if (_3 != 0)
    goto <bb 4>; [67.00%]
  else
    goto <bb 3>; [33.00%]
;;    succ:       4
;;                3

[ ... ]

;;   basic block 4, loop depth 0
;;    pred:       2
  if (vptr_14(D) != 0)
    goto <bb 5>; [25.37%]
  else
    goto <bb 10>; [74.63%]
;;    succ:       5
;;                10

;;   basic block 5, loop depth 0
;;    pred:       4
  # definition_10 = PHI <0(4)>
  # vstring_9 = PHI <0B(4)>
  if (ownvptr_15(D) != 0)
    goto <bb 7>; [80.00%]
  else
    goto <bb 6>; [20.00%]
;;    succ:       7
;;                6

;;   basic block 6, loop depth 0
;;    pred:       5
  sprintf (p_13(D), "~%%%s", vstring_9);
  goto <bb 10>; [100.00%]
;;    succ:       10

So when DOM discovers the edge equivalence for vptr_14 on the 4->5 edge, DOM now back-propagates the value to the uses of vptr_14, particularly uses in bb#1. That allows us to discover a simple equivalence for _1, which is a key nugget to unlocking this BZ.


1 = (_3 != 0) by way of traversing the 2->4 edge.
1 = (vptr_14 != 0) by way of traversing the 4->5 edge
_1 = 0 by backproping the state of vptr_14 to use point in bb1

The last step is to back-propagate the _1 = 0 equivalence to the use points of _1 in bb1. In particular to this statement:


_3 = _1 | _2;

If we know that _1 == 0, then _3 and _2 must have the same values (nonzero in this case).


_2 = ownvptr_15(D) != 0;

Since we know the state of _2, we can compute the state of ownvptr_15, which was the goal. We're still on the 4->5 edge, but we've managed to compute an equivalence for ownvptr_15, which in turn allows us to know how the branch at the end of bb5 will go.

Note this is not jump threading. It's a conditional equivalence with back propagation.
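
The chain of reasoning above can be replayed outside the compiler as a toy model. This is only a sketch and not GCC code: the Facts map, lookup and facts_on_edge_4_to_5 are hypothetical names made up for illustration, and each SSA name is reduced to a single "known nonzero or known zero" bit. It needs C++17.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Toy model, not GCC code: maps an SSA name to "is it known to be nonzero?".
// A missing entry means nothing is known about that name.
using Facts = std::map<std::string, bool>;

std::optional<bool> lookup(const Facts &facts, const std::string &name) {
  auto it = facts.find(name);
  if (it == facts.end())
    return std::nullopt;
  return it->second;
}

// Replays the reasoning for the 4->5 edge in the dump above.
Facts facts_on_edge_4_to_5() {
  Facts nonzero;
  nonzero["_3"] = true;       // 1 = (_3 != 0), from traversing the 2->4 edge
  nonzero["vptr_14"] = true;  // 1 = (vptr_14 != 0), from traversing the 4->5 edge

  // _1 = vptr_14 == 0: once vptr_14 is known, _1 is known too.
  if (auto v = lookup(nonzero, "vptr_14"))
    nonzero["_1"] = !*v;

  // _3 = _1 | _2: with _1 == 0, _2 has the same truth value as _3.
  auto r = lookup(nonzero, "_3");
  auto a = lookup(nonzero, "_1");
  if (r && a && !*a)
    nonzero["_2"] = *r;

  // _2 = ownvptr_15 != 0: once _2 is known, ownvptr_15 is known too.
  if (auto v = lookup(nonzero, "_2"))
    nonzero["ownvptr_15"] = *v;

  // Sanity check for this toy: the chain must reach ownvptr_15.
  assert(lookup(nonzero, "ownvptr_15").has_value());
  return nonzero;
}
```

In this model the result records ownvptr_15 as nonzero, which is the fact needed to decide the branch at the end of bb5.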

The additional lookups in the hash table trigger messages in the dump file. The unconstrained_commons.f test scans the DOM dump file to ensure certain messages never appear. That scan test is now bogus. The test has other things it checks to ensure DOM hasn't done anything wrong. So that one scan test in unconstrained_commons.f has been removed.

Bootstrapped and regression tested on x86, armv7, loongarch64, riscv64. Regression tested on the usual crosses as well.

commit 55c6baeb86b10912e98f4cf6b0a432d7c896d81e
Author: Jeff Law <[email protected]>
Date:   Sun Feb 22 09:26:38 2026 -0700

    [1/n][PR tree-optimization/90036] Allow refinement of entries in DOM hash table
    
    This is the first of a few patches to fix pr90036.
    
    I've gone back and forth about whether or not to fix this for gcc-16 or queue
    for gcc-17.  Ultimately I don't think these opportunities are *that* common, so
    I don't expect widespread code generation changes.
    
    I'm going to drop the changes in a small series as the changes stand on their
    own.  This gives us better bisectability.
    
    --
    
    The first patch allows refinement of existing equivalences in a case where we'd
    missed it before.  In particular say we have <res> = <expr> in the expression
    hash table.  We later use <expr> in a way that creates a temporary expression
    equivalence.  We'll fail to record that temporary expression equivalence
    because of the pre-existing entry in the hash table.
    
    And just to be clear, the old equivalence will be restored when we leave the
    domwalk scope of the newer, more precise, hash table entry.
    
    This matters for pr90036 as we initially enter a simple equivalence in the
    table with the result being an SSA_NAME.  Later we have a conditional that
    allows us to refine the result to a constant.  And we're going to need that
    constant result to trigger additional simplifications and equivalence
    discovery.
    
    Bootstrapped and regression tested on x86_64, aarch64, riscv64 and probably a
    couple others as well.  It's also been tested across the embedded targets in my
    tester.  Pushing to the trunk.
    
            PR tree-optimization/90036
    gcc/
            * tree-ssa-scopedtables.cc (avail_exprs_stack::record_cond): Always
            record the new hash table entry.

diff --git a/gcc/tree-ssa-scopedtables.cc b/gcc/tree-ssa-scopedtables.cc
index 828f214c7cb..95523b23478 100644
--- a/gcc/tree-ssa-scopedtables.cc
+++ b/gcc/tree-ssa-scopedtables.cc
@@ -392,13 +392,13 @@ avail_exprs_stack::record_cond (cond_equivalence *p)
   expr_hash_elt **slot;
 
  slot = m_avail_exprs->find_slot_with_hash (element, element->hash (), INSERT);
-  if (*slot == NULL)
-    {
-      *slot = element;
-      record_expr (element, NULL, '1');
-    }
-  else
-    delete element;
+
+  /* We will always get back a valid slot in the hash table.  Go ahead and
+     record the new equivalence.  While it may be overwriting something older,
+     the belief is that the newer equivalence is more likely to be useful as
+     it was derived using more information/context.  */
+  record_expr (element, *slot, '1');
+  *slot = element;
 }
 
 /* Generate a hash value for a pair of expressions.  This can be used
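
To illustrate the behaviour the commit message describes (a newer entry shadows an older one and the older one comes back when the scope is left), here is a small stand-alone sketch. It is not GCC's avail_exprs_stack: ScopedTable, Undo, push_scope, record, pop_scope and lookup are hypothetical names, expressions and results are plain strings, and the undo log is only an assumption about how such a table could be built. It needs C++17.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Toy scoped table, not GCC code.  Expressions and results are plain strings.
class ScopedTable {
public:
  // Open a new scope by pushing a marker onto the undo log.
  void push_scope() {
    m_undo.push_back({true, std::string(), std::nullopt});
    ++m_depth;
  }

  // Always record the new result, remembering whatever it replaced so the
  // older entry can be restored when the current scope is popped.
  void record(const std::string &expr, const std::string &result) {
    std::optional<std::string> old;
    auto it = m_table.find(expr);
    if (it != m_table.end())
      old = it->second;
    m_undo.push_back({false, expr, old});
    m_table[expr] = result;
  }

  // Undo every record() made since the matching push_scope().
  void pop_scope() {
    assert(m_depth > 0 && "pop_scope without matching push_scope");
    --m_depth;
    while (!m_undo.empty()) {
      Undo u = m_undo.back();
      m_undo.pop_back();
      if (u.marker)
        return;
      if (u.old)
        m_table[u.expr] = *u.old;  // restore the shadowed entry
      else
        m_table.erase(u.expr);     // nothing was there before
    }
  }

  std::optional<std::string> lookup(const std::string &expr) const {
    auto it = m_table.find(expr);
    if (it == m_table.end())
      return std::nullopt;
    return it->second;
  }

private:
  struct Undo {
    bool marker;                     // true for a scope boundary
    std::string expr;                // key that was written
    std::optional<std::string> old;  // previous result, if any
  };
  std::unordered_map<std::string, std::string> m_table;
  std::vector<Undo> m_undo;
  int m_depth = 0;
};
```

With this toy, recording "x_2" for an expression in an outer scope and then "0" for the same expression in an inner scope makes lookups return "0" until the inner scope is popped, after which they return "x_2" again.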
