On 1/3/25 11:46 AM, Richard Sandiford wrote:
Jeff Law <jeffreya...@gmail.com> writes:
This resurrects a patch from a bit over 2 years ago that I never wrapped
up. IIRC, I ended up catching covid, then was in the hospital for an
unrelated issue, and it just got dropped on the floor in the insanity.
The basic idea here is to help postreload-cse eliminate more
const/copies by recording a small set of conditional equivalences (as
Richi said in 2022, "Ick").
It was originally to help eliminate an unnecessary constant load I saw
in coremark, but as seen in BZ107455 the same issues show up in real
code as well.
Richard B and Richard S both had good comments last time around and
their requests are reflected in this update:
- Use rtx_equal_p rather than pointer equality
- Restrict to register "destinations"
- Restrict to integer modes
- Adjust entry block handling
My own wider-scale testing resulted in a few more changes.
- Robustify extracting the (set (pc) ... ), which then required ...
- Handle if src/dst are clobbered by the conditional branch
- Fix logic error causing too many equivalences to be recorded
Heh. I delayed reviewing this for a bit because I have no memory
of seeing it before (sign I'm getting old). I haven't found my
previous comments, so sorry if this completely contradicts what
I said before...
Happens to me all the time. I've even had cases where I found my
comments and have no recollection of them at all. It's depressing.
@@ -221,13 +222,116 @@ reload_cse_regs_1 (void)
init_alias_analysis ();
FOR_EACH_BB_FN (bb, cfun)
- FOR_BB_INSNS (bb, insn)
- {
- if (INSN_P (insn))
- cfg_changed |= reload_cse_simplify (insn, testreg);
+ {
+ /* If BB has a small number of predecessors, see if each of them
+ has the same implicit set. If so, record that implicit set so
+ that we can add it to the cselib tables. */
+ rtx_insn *implicit_set;
- cselib_process_insn (insn);
- }
+ implicit_set = NULL;
+ if (EDGE_COUNT (bb->preds) <= 3)
+ {
+ edge e;
+ edge_iterator ei;
+ rtx src = NULL_RTX;
+ rtx dest = NULL_RTX;
+
+ /* Iterate over each incoming edge and see if they
+ all have the same implicit set. */
+ FOR_EACH_EDGE (e, ei, bb->preds)
+ {
+ /* If the predecessor does not end in a conditional
+ jump, then it does not have an implicit set. */
+ if (e->src == ENTRY_BLOCK_PTR_FOR_FN (cfun)
+ || !block_ends_with_condjump_p (e->src))
AFAICT, this is directly equivalent to:
any_condjump_p (BB_END (e->src))
as checked below, so I think we should just check that (trusting
that BB_END is nonnull if e->src isn't the entry block). In particular...
Agreed.
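So the test on each predecessor just collapses to something like this
(a sketch, modulo whatever the respin ends up looking like):

  /* If the predecessor does not end in a conditional
     jump, then it does not have an implicit set.  */
  if (e->src == ENTRY_BLOCK_PTR_FOR_FN (cfun)
      || !any_condjump_p (BB_END (e->src)))
    break;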
+ break;
+
+ /* We know the predecessor ends with a conditional
+ jump. Now dig into the actual form of the jump
+ to potentially extract an implicit set. */
+ rtx_insn *condjump = BB_END (e->src);
+ if (!condjump
+ || ! any_condjump_p (condjump)
+ || ! onlyjump_p (condjump))
+ break;
...the onlyjump_p doesn't seem necessary given the later reg_set_p.
Instead we can use...
Also agreed.
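Right, what matters is just that the branch doesn't clobber either side
of the comparison, and the later check handles that. Conceptually it's
something like the following, once dest/src have been recorded for the
edge (a sketch only, not the exact respun code):

  /* Don't record an equivalence if the branch insn itself
     clobbers either side of the comparison.  */
  if (reg_set_p (dest, condjump)
      || (REG_P (src) && reg_set_p (src, condjump)))
    break;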
+
+ /* This predecessor ends with a possible equivalence
+ producing conditional branch. Extract the condition
+ and try to use it to create an equivalence. */
+ rtx pat = single_set (condjump);
...pc_set instead of single_set here.
Agreed that's clearer on intent.
+ rtx i_t_e = SET_SRC (pat);
+ gcc_assert (GET_CODE (i_t_e) == IF_THEN_ELSE);
+ rtx cond = XEXP (i_t_e, 0);
+ if ((GET_CODE (cond) == EQ
+ && GET_CODE (XEXP (i_t_e, 1)) == LABEL_REF
+ && XEXP (XEXP (i_t_e, 1), 0) == BB_HEAD (bb))
+ || (GET_CODE (cond) == NE
+ && XEXP (i_t_e, 2) == pc_rtx
+ && e->src->next_bb == bb))
How about using:
if (((e->flags & EDGE_FALLTHRU) != 0)
== (XEXP (i_t_e, 1) == pc_rtx)
? GET_CODE (cond) == EQ
: GET_CODE (cond) == NE)
(borrowing a construct from cfgcleanup.cc). That seems more general,
since some targets allow (if_then_else ... (pc) (label_ref)).
It's also a bit shorter.
Wow. I had to read that a few times, but yea, it does appear correct to
me. I added a level of parentheses which helps clarify things
ever-so-slightly IMHO.
BTW, at some point in the past I played around with supporting the
reversed pc/label_ref on a port (probably the H8). I was surprised at
how rarely the reversed case came up in practice.
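FWIW, with the extra parens it reads roughly like this (quoting from
memory, so treat it as a sketch rather than the final respin):

  /* Whether edge E is taken exactly when COND holds.  If so, an EQ
     comparison gives us an equivalence; if E is instead taken when
     COND fails, an NE comparison gives us one.  */
  if ((((e->flags & EDGE_FALLTHRU) != 0)
       == (XEXP (i_t_e, 1) == pc_rtx))
      ? GET_CODE (cond) == EQ
      : GET_CODE (cond) == NE)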
+ {
+ /* If this is the first time through record
+ the source and destination. */
+ if (!dest)
+ {
+ dest = XEXP (cond, 0);
+ src = XEXP (cond, 1);
+ }
+ /* If this is not the first time through, then
+ verify the source and destination match. */
+ else if (REG_P (dest)
+ && REG_P (src)
This prevents a match in the CONST_INT_P case allowed below. It might
be worth dropping these two conditions and leaving rtx_equal_p to do
its thing.
Totally agreed. I wonder if I added those checks when I took the patch
over to x86 for bootstrap testing and never went back and checked that
the testcases worked on risc-v after making some adjustments.
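With the REG_P tests gone the match check becomes something along these
lines (again just a sketch, not the respun patch):

  /* If this is not the first time through, then verify the
     source and destination match what we saw on the earlier
     edges; if not, there is no common implicit set.  */
  else if (!rtx_equal_p (dest, XEXP (cond, 0))
           || !rtx_equal_p (src, XEXP (cond, 1)))
    break;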
+ /* A few more checks. First make sure we're dealing with
+ integer modes, second make sure the values aren't clobbered
+ by the conditional branch itself. Do this for every
+ conditional jump participating in creation of the
+ equivalence. */
+ if (!REG_P (dest)
+ || !(REG_P (src) || CONST_INT_P (src))
+ || GET_MODE_CLASS (GET_MODE (dest)) != MODE_INT
This feels unnecessarily restrictive: it looks like it should work
for any mode, and for any type of constant. But I agree that
reload_cse_simplify probably isn't going to do much for other cases
and so it's probably not worth opening an extra avenue for bugs.
Agreed. It might work for something like returning a constant FP value
or a constant vector. I hadn't tested those cases and was being
conservative as a result.
I'll respin and repost. Thanks!
jeff