On Wed, 2024-07-31 at 16:57 +0800, Lulu Cheng wrote:
> 
> On 2024/7/29 at 3:58 PM, Xi Ruoyao wrote:
> > Per a gcc-help thread we are generating sub-optimal code for
> > __builtin_bswap{32,64}.  To fix it:
> > 
> > - Use a single revb.d instruction for bswapdi2.
> > - Use a single revb.2w instruction for bswapsi2 for TARGET_64BIT,
> >     revb.2h + rotri.w for !TARGET_64BIT.
> > - Use a single revb.2h instruction for bswapsi2 (x) r>> 16, and a single
> >     revb.2w instruction for bswapdi2 (x) r>> 32.
> > 
> > Unfortunately I cannot figure out a way to make the compiler generate
> > revb.4h or revh.{2w,d} instructions.
> 
> This optimization is really ingenious and I have no objections.
> 
> I also haven't figured out how to generate revb.4h or revh.{2w,d}.
> I think we can merge this patch first.

Pushed r15-2433.

FWIW I tried a naive pattern for revh.2w:

(set (match_operand:DI 0 "register_operand" "=r")
     (ior:DI
       (and:DI
         (ashift:DI (match_operand:DI 1 "register_operand" "r")
                    (const_int 16))
         (const_int 18446462603027742720))
       (and:DI
         (lshiftrt:DI (match_dup 1)
                      (const_int 16))
         (const_int 281470681808895))))

But it seems too complex to be recognized.
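
For reference, here is a minimal C sketch of the operation that pattern
describes: swapping the two 16-bit halves inside each 32-bit word of a
64-bit value.  The function name swap_halfwords_2w is made up for
illustration and is not part of the patch; the two masks are the
decimal constants from the pattern written in hex.  Whether GCC maps
this source form to revh.2w is exactly the open question, so treat it
only as a statement of the intended semantics.

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch: swap the two 16-bit halves
   within each 32-bit word of X.
   0xffff0000ffff0000 == 18446462603027742720
   0x0000ffff0000ffff == 281470681808895  */
static uint64_t
swap_halfwords_2w (uint64_t x)
{
  return ((x << 16) & 0xffff0000ffff0000ULL)
         | ((x >> 16) & 0x0000ffff0000ffffULL);
}
```

For example, swap_halfwords_2w (0x0123456789abcdef) yields
0x45670123cdef89ab.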

-- 
Xi Ruoyao <xry...@xry111.site>
School of Aerospace Science and Technology, Xidian University
