On Fri, Jul 14, 2023 at 11:27 AM Roger Sayle <ro...@nextmovesoftware.com> wrote: > > > > From: Uros Bizjak <ubiz...@gmail.com> > > Sent: 13 July 2023 19:21 > > > > On Thu, Jul 13, 2023 at 7:10 PM Roger Sayle <ro...@nextmovesoftware.com> > > wrote: > > > > > > This patch resolves PR target/110588 to catch another case in combine > > > where the i386 backend should be generating a btl instruction. This > > > adds another define_insn_and_split to recognize the RTL representation > > > for this case. > > > > > > I also noticed that two related define_insn_and_split weren't using > > > the preferred string style for single statement > > > preparation-statements, so I've reformatted these to be consistent in > > > style with > > the new one. > > > > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > > > and make -k check, both with and without --target_board=unix{-m32} > > > with no new failures. Ok for mainline? > > > > > > > > > 2023-07-13 Roger Sayle <ro...@nextmovesoftware.com> > > > > > > gcc/ChangeLog > > > PR target/110588 > > > * config/i386/i386.md (*bt<mode>_setcqi): Prefer string form > > > preparation statement over braces for a single statement. > > > (*bt<mode>_setncqi): Likewise. > > > (*bt<mode>_setncqi_2): New define_insn_and_split. > > > > > > gcc/testsuite/ChangeLog > > > PR target/110588 > > > * gcc.target/i386/pr110588.c: New test case. > > > > +;; Help combine recognize bt followed by setnc (PR target/110588) > > +(define_insn_and_split "*bt<mode>_setncqi_2" > > + [(set (match_operand:QI 0 "register_operand") (eq:QI > > + (zero_extract:SWI48 > > + (match_operand:SWI48 1 "register_operand") > > + (const_int 1) > > + (zero_extend:SI (match_operand:QI 2 "register_operand"))) > > + (const_int 0))) > > + (clobber (reg:CC FLAGS_REG))] > > + "TARGET_USE_BT && ix86_pre_reload_split ()" > > + "#" > > + "&& 1" > > + [(set (reg:CCC FLAGS_REG) > > + (compare:CCC > > + (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)) > > + (const_int 0))) > > + (set (match_dup 0) > > + (ne:QI (reg:CCC FLAGS_REG) (const_int 0)))] > > + "operands[2] = lowpart_subreg (SImode, operands[2], QImode);") > > > > I don't think the above transformation is 100% correct, mainly due to the > > use of > > paradoxical subreg. > > > > The combined instruction is operating with a zero_extended QImode register, > > so > > all bits of the register are well defined. You are splitting using > > paradoxical subreg, > > so you don't know what garbage is there in the highpart of the count > > register. > > However, BTL/BTQ uses modulo 64 (or 32) of this register, so even with a > > slightly > > invalid RTX, everything checks out. > > > > + "operands[2] = lowpart_subreg (SImode, operands[2], QImode);") > > > > You probably need <SWI48>mode instead of SImode here. > > The define_insn for *bt is: > > (define_insn "*bt<mode>" > [(set (reg:CCC FLAGS_REG) > (compare:CCC > (zero_extract:SWI48 > (match_operand:SWI48 0 "nonimmediate_operand" "r,m") > (const_int 1) > (match_operand:SI 1 "nonmemory_operand" "r<S>,<S>")) > (const_int 0)))] > > So <SWI48> isn't appropriate here. > > But now you've made me think about it, it's inconsistent that all of the > shifts > and rotates in i386.md standardize on QImode for shift counts, but the bit > test > instructions use SImode? I think this explains where the paradoxical SUBREGs > come from, and in theory any_extend from QImode to SImode here could/should > be handled/unnecessary. > > Is it worth investigating a follow-up patch to convert all ZERO_EXTRACTs and > SIGN_EXTRACTs in i386.md to use QImode (instead of SImode)?
IIRC, zero_extract was moved from modeless to a pattern with defined mode a while ago. Perhaps SImode is just because of these ancient times, and BT pattern was written that way to satisfy combine. I think it is definitely worth investigating; perhaps some BT-related pattern will become obsolete because of the change. Uros.