Ping!

Please review. 

Thanks & Regards
Kishan

On 09/09/25 2:04 pm, Kishan Parmar wrote:
> Hi All,
>
> The fallback expansion of __builtin_bswap16 on pre-Power10 used a
> sequence of multiple rlwinm/or instructions:
>
>   mr      r9,r3
>   rlwinm  r3,r9,24,24,31
>   rlwinm  r10,r9,8,16,23
>   or      r3,r3,r10
>   rlwinm  r3,r3,0,0xffff
>
> This was functionally correct but suboptimal.
>
> Rewrite the splitter to use a rotate-insert idiom, producing:
>
>   mr      r9,r3
>   slwi    r3,r9,8
>   rlwimi  r3,r9,24,24,31
>   rlwinm  r3,r3,0,0xffff
>
> This sequence is one instruction shorter and maps directly to the
> rlwimi instruction.
>
> The following patch has been bootstrapped on powerpc64le-linux.
>
> 2025-09-09  Kishan Parmar  <[email protected]>
>
> gcc/
>       PR target/121076
>       * config/rs6000/rs6000.md (bswaphi2_reg): Replace multi-instruction
>       rotate-mask/rotate-mask/or sequence with shift/rotate-mask-insert
>       idiom, reducing insn count for bswap16 on pre-Power10 targets.
> ---
>  gcc/config/rs6000/rs6000.md | 17 +++++++----------
>  1 file changed, 7 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 04a6c0f7461..7c48cb900b6 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -2676,21 +2676,18 @@
>     xxbrh %x0,%x1"
>    "reload_completed && !TARGET_POWER10 && int_reg_operand (operands[0], HImode)"
>    [(set (match_dup 3)
> -     (and:SI (lshiftrt:SI (match_dup 4)
> -                          (const_int 8))
> -             (const_int 255)))
> -   (set (match_dup 2)
> -     (and:SI (ashift:SI (match_dup 4)
> -                        (const_int 8))
> -             (const_int 65280)))             ;; 0xff00
> +     (ashift:SI (match_dup 4)
> +                (const_int 8)))
>     (set (match_dup 3)
> -     (ior:SI (match_dup 3)
> -             (match_dup 2)))]
> +     (ior:SI (and:SI (match_dup 3)
> +                     (const_int -256))
> +             (and:SI (lshiftrt:SI (match_dup 4) (const_int 8))
> +                     (const_int 255))))]         ;; 0x00ff
>  {
>    operands[3] = simplify_gen_subreg (SImode, operands[0], HImode, 0);
>    operands[4] = simplify_gen_subreg (SImode, operands[1], HImode, 0);
>  }
> -  [(set_attr "length" "*,12,*")
> +  [(set_attr "length" "*,8,*")
>     (set_attr "type" "shift,*,vecperm")
>     (set_attr "isa" "p10,*,p9v")])
>  
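
For reviewers who want to sanity-check the equivalence of the two
sequences quoted above, here is a small C model. It is only a sketch and
not part of the patch: the helper names (rotl32, mask32, rlwinm, rlwimi,
old_seq, new_seq, check_bswap16_sequences) are made up for illustration,
and the rlwinm/rlwimi models assume the usual Power ISA semantics with a
non-wrapping mask (MB <= ME, bit 0 being the most significant bit).

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Mask with bits mb..me set, IBM numbering (bit 0 = MSB); assumes mb <= me. */
static uint32_t mask32(unsigned mb, unsigned me)
{
    return (0xffffffffu >> mb) & (0xffffffffu << (31 - me));
}

/* rlwinm: rotate left, then AND with mask. */
static uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    return rotl32(rs, sh) & mask32(mb, me);
}

/* rlwimi: rotate left, then insert under mask into the old value of ra. */
static uint32_t rlwimi(uint32_t ra, uint32_t rs, unsigned sh,
                       unsigned mb, unsigned me)
{
    uint32_t m = mask32(mb, me);
    return (rotl32(rs, sh) & m) | (ra & ~m);
}

/* Old expansion: rlwinm / rlwinm / or, then the 0xffff mask. */
static uint32_t old_seq(uint32_t r9)
{
    uint32_t r3 = rlwinm(r9, 24, 24, 31);
    uint32_t r10 = rlwinm(r9, 8, 16, 23);
    r3 |= r10;
    return rlwinm(r3, 0, 16, 31);
}

/* New expansion: slwi / rlwimi, then the 0xffff mask. */
static uint32_t new_seq(uint32_t r9)
{
    uint32_t r3 = r9 << 8;              /* slwi r3,r9,8 */
    r3 = rlwimi(r3, r9, 24, 24, 31);
    return rlwinm(r3, 0, 16, 31);
}

/* Check both sequences against a reference bswap16 for every 16-bit
   input, with and without garbage in the upper half of the register. */
static void check_bswap16_sequences(void)
{
    const uint32_t high[] = { 0u, 0xdead0000u };
    for (unsigned h = 0; h < 2; h++)
        for (uint32_t x = 0; x <= 0xffffu; x++) {
            uint32_t in = high[h] | x;
            uint32_t ref = ((x & 0xffu) << 8) | (x >> 8);
            assert(old_seq(in) == ref);
            assert(new_seq(in) == ref);
        }
}
```

Calling check_bswap16_sequences () from a main function is enough to run
the check. The model says nothing about register allocation or
scheduling; it only compares the arithmetic of the two sequences.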
