I'm a bit puzzled by the decrement_and_branch_until_zero looping pattern. The manual describes it as a named pattern, though from the description it isn't clear that it's actually referenced by name. I see it only in m68k and pa. There are similar-looking but anonymous patterns in pdp11 and vax, suggesting that those were meant to be recognized by their structure.
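For concreteness, a count-down loop of roughly the following shape is the sort of source one would expect to be a candidate for a decrement-and-branch instruction. This is only an illustrative sketch: the function name sum_countdown is made up, and it is not the exact test file discussed here.

```c
/* Hypothetical example: sum n ints using a counter that runs down to zero.
   The loop counter is used only for loop control, which is the shape a
   decrement-and-branch instruction is suited to. */
int sum_countdown(const int *p, unsigned n)
{
    int sum = 0;

    /* Test the counter against zero, then decrement it. */
    while (n-- != 0)
        sum += *p++;

    return sum;
}
```

Compiling something like this with -O2 -S (or -Os -S) for the target in question and reading the assembly output shows whether the loop ends in a single decrement-and-branch instruction or in a separate decrement, compare, and branch.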
One puzzle is that the body of gcc doesn't reference that pattern name as far as I can see. The other puzzle is that I see no sign that the pattern works.
I made up my own simple test file and I can't get pdp11, vax, or m68k to generate a loop using that pattern. Stranger yet, there is a test case gcc.c-torture/execution/dbra-1.c -- a name that suggests it's meant to test this mechanism, because dbra is the m68k name for the relevant instruction. That test case doesn't generate these looping instructions either (I tried it with m68k, vax, and pdp11 as well). Finally, I tried that file with an old 4.8.0 build for pdp11 I happened to have lying around. None of these seem to use that loop optimization, with -O2 or -Os.
Did I miss some magic switch to turn on something that isn't on by default? Or is this a feature that was broken long ago and not noticed? If so, any hints where I might look for a reason?
paul