On 7/18/24 3:12 PM, Georg-Johann Lay wrote:
This new builtin provides a faster way to compute
expressions like  1 << x  or ~(1 << x) that are sometimes
used as masks for setting bits in the I/O region, and when
x is not known at compile-time.

The open coded C expression takes 5 + 4 * x cycles to compute an
8-bit result, whereas the builtin takes only 7 cycles independent
of x.

The implementation is straight forward and uses 3 new insns.

Ok for trunk?

Johann

--

AVR: Support new built-in function __builtin_avr_mask1.

gcc/
     * config/avr/builtins.def (MASK1): New DEF_BUILTIN.
     * config/avr/avr.cc (avr_rtx_costs_1): Handle rtx costs for
     expressions like __builtin_avr_mask1.
     (avr_init_builtins) <uintQI_ftype_uintQI_uintQI>: New tree type.
     (avr_expand_builtin) [AVR_BUILTIN_MASK1]: Diagnose unexpected forms.
     (avr_fold_builtin) [AVR_BUILTIN_MASK1]: Handle case.
     * config/avr/avr.md (gen_mask1): New expand helper.
     (mask1_0x01_split, mask1_0x80_split, mask1_0xfe_split): New
     insn-and-split.
     (*mask1_0x01, *mask1_0x80, *mask1_0xfe): New insns.
     * doc/extend.texi (AVR Built-in Functions) <__builtin_avr_mask1>:
     Document new built-in function.
gcc/testsuite/
     * gcc.target/avr/torture/builtin-mask1.c: New test.
OK from a technical standpoint.  Just a few nits.



builtin-mask1.diff

diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index b9064424ffe..d8417c4ba3e 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc
@@ -13135,6 +13135,16 @@ avr_rtx_costs_1 (rtx x, machine_mode mode, int 
outer_code,
        switch (mode)
        {
        case E_QImode:
+         if (speed
+             && XEXP (x, 0) == const1_rtx
CONST1_RTX (mode) is probably more appropriate.


+             && AND == GET_CODE (XEXP (x, 1)))
We generally write conditions with the constant as the 2nd operand.  So
GET_CODE (XEXP (x, 1) == AND



@@ -13308,6 +13318,15 @@ avr_rtx_costs_1 (rtx x, machine_mode mode, int 
outer_code,
          break;
case E_HImode:
+         if (CONST_INT_P (XEXP (x, 0))
+             && INTVAL (XEXP (x, 0)) == 128
+             && AND == GET_CODE (XEXP (x, 1)))
Similarly for this condition.



+    switch (0xff & INTVAL (operands[1]))
Similarly here.


+    {
+      case 0x01: emit (gen_mask1_0x01_split (operands[0], operands[2])); break;
+      case 0x80: emit (gen_mask1_0x80_split (operands[0], operands[2])); break;
+      case 0xfe: emit (gen_mask1_0xfe_split (operands[0], operands[2])); break;
Don't put code on the same line as the case. Bring it down to the next line and indent 2 positions further.

Those are all NFC, so approved with those changes.

I'm not sure the builtin is strictly needed since this could likely be handled in the shift expanders and/or combiner patterns. But I've got no objection to adding the builtin.

jeff

Reply via email to