Quoting Richard Sandiford <rdsandif...@googlemail.com>:

It's about describing complex interactions of length adjustments that
derive from branch shortening and length added for (un)alignment for
scheduling purposes.  Expressed naively, these can lead to cycles.

But shorten_branches should be written to avoid cycles, and I thought
your patch did that by making sure that the address of each instruction
only increases.

This actually gives a very poor branch shortening result for ARC.
What I did before with the lock_length attribute was only locking in increases
for the length of branches, and preserving a complex set of invariants to
avoid cycles from the (mis)alignment padding.

Saying that for some instructions, the length locking comes into effect
only after a few iterations of not permanently increasing the size of
other instructions is still cycle-safe.

The length variations for the ARC are not all alike: there are branches
that are subject to branch shortening in the usual way, but they might
shrink when other things shrink.  When we are iterating starting at
minimal length and increasing lengths, these branches should be stopped
from shrinking, as they will likely grow back again in the next iteration.
OTOH there are potentially short instructions that are lengthened to
achieve scheduling-dictated alignment or misalignment.  These should
be allowed to change freely back and forth, unless we have a rare
cycle that only depends on alignments.
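
To make the distinction above concrete, here is a toy sketch of such an
iteration.  It is not GCC's shorten_branches and not the ARC port's code:
the Insn class, the 2/4-byte sizes, the 8-byte short-branch reach, and the
4-byte alignment goal are all assumptions made up for illustration.
Branches only ever grow (increases are locked in), while "pad"
instructions may flip between sizes in either direction.

```python
from dataclasses import dataclass

SHORT, LONG = 2, 4          # hypothetical encoding sizes in bytes
SHORT_RANGE = 8             # hypothetical reach of a short branch, in bytes


@dataclass
class Insn:
    kind: str               # "plain", "branch" or "pad"
    length: int = SHORT     # current length estimate
    target: int = -1        # index of the branch target (branches only)


def addresses(insns):
    """Return the start address of every instruction, plus the end address."""
    addrs, addr = [], 0
    for insn in insns:
        addrs.append(addr)
        addr += insn.length
    addrs.append(addr)
    return addrs


def shorten(insns, max_iterations=20):
    """Iterate to a fixed point; return the number of passes used."""
    for iteration in range(1, max_iterations + 1):
        changed = False
        addrs = addresses(insns)  # addresses as of the start of this pass
        for i, insn in enumerate(insns):
            if insn.kind == "branch":
                distance = abs(addrs[insn.target] - addrs[i])
                needed = SHORT if distance <= SHORT_RANGE else LONG
                # Lock in increases: a branch never shrinks again.
                new_length = max(insn.length, needed)
            elif insn.kind == "pad":
                # Free to move both ways: pick the size that puts the
                # following instruction on a 4-byte boundary.
                new_length = SHORT if (addrs[i] + SHORT) % 4 == 0 else LONG
            else:
                new_length = insn.length
            if new_length != insn.length:
                insn.length = new_length
                changed = True
        if not changed:
            return iteration
    raise RuntimeError("no fixed point; possibly an alignment-only cycle")
```

The iteration limit stands in for the rare alignment-only cycle: since
branches are monotone, only the freely changing instructions could keep
the loop from settling, and the sketch simply gives up in that case
rather than trying to break the cycle.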

Hmm, but this is still talking in terms of shorten_branches, rather than
the target property that we're trying to model.  It sounds like the property
is something along the lines of: some instructions have different encodings
(or work-alikes with different encodings), and we want to make all the
encodings available to the optimisers.  Is that right?

No, not really.  There are lots of redundant encodings, but we have only
one favourite encoding for each size.  We usually want the shortest version,
unless we want to tweak the alignment of a following instruction, or in
the case of unaligned conditional branches with an unfilled delay slot, to
hide the branch penalty (don't ask me why it has that effect...).

If so, that sounds
like a useful thing to model, but I think it should be modelled directly,
rather than as a modification to the shorten_branches algorithm.

The shorten_branches algorithm can't give the correct result without
taking these things into account.  Also, instruction lengths are needed
to know the requirements.  So, do you want me to run a scheduling
pass in each iteration of shorten_branches?

Is the alignment/misalignment explicitly modelled in the rtl?  If so, how?
Using labels, or something else?

No, it works with instruction addresses during branch shortening, and
extra bits in cfun->machine during instruction output to keep track of
the current alignment.
