On Wed, 2024-07-31 at 16:57 +0800, Lulu Cheng wrote:
> 
> On 2024/7/29 at 3:58 PM, Xi Ruoyao wrote:
> > Per a gcc-help thread we are generating sub-optimal code for
> > __builtin_bswap{32,64}.  To fix it:
> > 
> > - Use a single revb.d instruction for bswapdi2.
> > - Use a single revb.2w instruction for bswapsi2 for TARGET_64BIT,
> >   revb.2h + rotri.w for !TARGET_64BIT.
> > - Use a single revb.2h instruction for bswapsi2 (x) r>> 16, and a single
> >   revb.2w instruction for bswapdi2 (x) r>> 32.
> > 
> > Unfortunately I cannot figure out a way to make the compiler generate
> > revb.4h or revh.{2w,d} instructions.
> 
> This optimization is really ingenious and I have no problem.
> 
> I also haven't figured out how to generate revb.4h or revh.{2w,d}.
> I think we can merge this patch first.
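
For readers following the quoted list, here is a minimal C sketch of the two "bswap then rotate" shapes it mentions. The function names and the rotate helpers are made up for illustration and are not from the patch. Whether a particular GCC build emits a single revb.2h or revb.2w for them has not been checked here.

```c
#include <stdint.h>

/* Hypothetical helper: rotate a 32-bit value right by n bits (0 < n < 32). */
static inline uint32_t rotr32(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32 - n));
}

/* Hypothetical helper: rotate a 64-bit value right by n bits (0 < n < 64). */
static inline uint64_t rotr64(uint64_t x, unsigned n)
{
    return (x >> n) | (x << (64 - n));
}

/* bswapsi2 (x) r>> 16: reverses the bytes inside each 16-bit halfword,
   which is the shape the patch intends to map to revb.2h. */
uint32_t bswap32_rotr16(uint32_t x)
{
    return rotr32(__builtin_bswap32(x), 16);
}

/* bswapdi2 (x) r>> 32: reverses the bytes inside each 32-bit word,
   which is the shape the patch intends to map to revb.2w. */
uint64_t bswap64_rotr32(uint64_t x)
{
    return rotr64(__builtin_bswap64(x), 32);
}
```

For example, bswap32_rotr16 (0x11223344) yields 0x22114433: the two halfwords stay in place and only the bytes inside each one are swapped.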
Pushed r15-2433.

FWIW I tried a naive pattern for revh.2w:

(set (match_operand:DI 0 "register_operand" "=r")
     (ior:DI
       (and:DI
         (ashift:DI (match_operand:DI 1 "register_operand" "r")
                    (const_int 16))
         (const_int 18446462603027742720))
       (and:DI
         (lshiftrt:DI (match_dup 1)
                      (const_int 16))
         (const_int 281470681808895))))

But it seems too complex to be recognized.

-- 
Xi Ruoyao <xry...@xry111.site>
School of Aerospace Science and Technology, Xidian University
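
For reference, the naive revh.2w pattern corresponds roughly to the C expression below. The two const_int values are 0xffff0000ffff0000 and 0x0000ffff0000ffff in hex. The function name is hypothetical. This is only a sketch of the source-level idiom the pattern describes, not a form that GCC is known to turn into revh.2w.

```c
#include <stdint.h>

/* Swap the two 16-bit halfwords inside each 32-bit word of x.
   Mirrors the RTL: (x << 16) & 0xffff0000ffff0000
                  | (x >> 16) & 0x0000ffff0000ffff */
uint64_t swap_halfwords_per_word(uint64_t x)
{
    return ((x << 16) & 0xffff0000ffff0000ULL)
         | ((x >> 16) & 0x0000ffff0000ffffULL);
}
```

For example, swap_halfwords_per_word (0x1122334455667788) yields 0x3344112277885566.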