On 19.07.24 at 16:56, Jeff Law wrote:
On 7/18/24 3:12 PM, Georg-Johann Lay wrote:
This new builtin provides a faster way to compute
expressions like 1 << x or ~(1 << x), which are sometimes
used as masks for setting or clearing bits in the I/O region
when x is not known at compile time.
The open coded C expression takes 5 + 4 * x cycles to compute an
8-bit result, whereas the builtin takes only 7 cycles independent
of x.
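
To put the two cycle counts side by side, here is a small host-side
sketch that only encodes the figures quoted above. The function names
are hypothetical and appear nowhere in the patch; nothing here is
measured on hardware.

```c
#include <stdint.h>

/* Cycle count quoted above for the open-coded 8-bit 1 << x.
   Hypothetical helper, for illustration only. */
static unsigned open_coded_cycles(uint8_t x)
{
    return 5u + 4u * x;
}

/* Cycle count quoted above for the builtin: constant, independent of x.
   Hypothetical helper, for illustration only. */
static unsigned builtin_cycles(uint8_t x)
{
    (void) x;
    return 7u;
}

/* Nonzero if, by these figures, the builtin is faster for shift count x. */
static int builtin_is_faster(uint8_t x)
{
    return builtin_cycles(x) < open_coded_cycles(x);
}
```

By these figures the open-coded form is cheaper only for x = 0
(5 cycles versus 7); from x = 1 on the builtin wins, and at x = 7 the
open-coded form takes 33 cycles.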
The implementation is straightforward and uses 3 new insns.
Ok for trunk?
Johann
--
AVR: Support new built-in function __builtin_avr_mask1.
gcc/
* config/avr/builtins.def (MASK1): New DEF_BUILTIN.
* config/avr/avr.cc (avr_rtx_costs_1): Handle rtx costs for
expressions like __builtin_avr_mask1.
(avr_init_builtins) <uintQI_ftype_uintQI_uintQI>: New tree type.
(avr_expand_builtin) [AVR_BUILTIN_MASK1]: Diagnose unexpected forms.
(avr_fold_builtin) [AVR_BUILTIN_MASK1]: Handle case.
* config/avr/avr.md (gen_mask1): New expand helper.
(mask1_0x01_split, mask1_0x80_split, mask1_0xfe_split): New
insn-and-split.
(*mask1_0x01, *mask1_0x80, *mask1_0xfe): New insns.
* doc/extend.texi (AVR Built-in Functions) <__builtin_avr_mask1>:
Document new built-in function.
gcc/testsuite/
* gcc.target/avr/torture/builtin-mask1.c: New test.
OK from a technical standpoint. Just a few nits.
[...]
I'm not sure the builtin is strictly needed since this could likely be
handled in the shift expanders and/or combiner patterns. But I've got
no objection to adding the builtin.
jeff
Hi Jeff,
at least combiner patterns won't work. For something like
var |= 1 << (off & 7)
insn combine is just getting lost; it tries expressions with
MEM, IOR, even PARALLELs, but nothing that's close to a rotation.
Also it doesn't break out memory references (presumably because it
has no scratch reg available). It gets even worse for
var &= ~ (1 << (off & 7))
avr.md has rotlqi3, but combine never attempts to emit
something like rotl (1, off) or rotr (0x80, off).
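
For reference, the equivalence that combine would have to discover can
be checked on a host with a portable C model. This is only a sketch:
rotl8, rotr8, set_mask and clear_mask are hypothetical helpers, not
avr.md patterns or GCC built-ins.

```c
#include <stdint.h>

/* Rotate an 8-bit value left by n & 7 bits (hypothetical helper). */
static uint8_t rotl8(uint8_t v, uint8_t n)
{
    n &= 7;
    return (uint8_t) ((v << n) | (v >> ((8 - n) & 7)));
}

/* Rotate an 8-bit value right by n & 7 bits (hypothetical helper). */
static uint8_t rotr8(uint8_t v, uint8_t n)
{
    n &= 7;
    return (uint8_t) ((v >> n) | (v << ((8 - n) & 7)));
}

/* Open-coded mask as used in  var |= 1 << (off & 7).  */
static uint8_t set_mask(uint8_t off)
{
    return (uint8_t) (1u << (off & 7));
}

/* Open-coded mask as used in  var &= ~ (1 << (off & 7)).  */
static uint8_t clear_mask(uint8_t off)
{
    return (uint8_t) ~(1u << (off & 7));
}
```

For every off, rotl8 (1, off) equals set_mask (off) and
rotl8 (0xfe, off) equals clear_mask (off); likewise rotr8 (0x80, off)
equals 0x80 >> (off & 7).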
Johann