> You disable fusion for Bulldozer here since you did not add it into
> TARGET_FUSE_CMP_AND_BRANCH_64.

Ok, will add it.

> Perhaps we can have TARGET_FUSE_CMP_AND_BRANCH_64 and
> TARGET_FUSE_CMP_AND_BRANCH_32
> plus a macro TARGET_FUSE_CMP_AND_BRANCH that chooses the corresponding
> variant based on TARGET_64BIT rather than having to wind down the test
> in every use.

Ok, will fix it.

>> +/* X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS: Fuse compare with a
>> +   subsequent conditional jump instruction when the condition jump
>> +   check sign flag (SF) or overflow flag (OF).  */
>> +DEF_TUNE (X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS,
>> "fuse_cmp_and_branch_soflags",
>> +          m_COREI7 | m_COREI7_AVX | m_HASWELL)
>
> Is this flag affecting only fusing of ALU and BRANCH, or should it also
> affect X86_TUNE_FUSE_CMP_AND_BRANCH?  In the current implementation it
> seems to be the first, and in that case it ought to be documented that
> way and probably called ALT_AND_BRANCH_SOFLAGS to avoid confusion.

X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS does not affect fusing ALU and
BRANCH. It is added because m_CORE2 doesn't support fusing cmp with
JL/JG/JLE/JGE.

> This is what Agner Fog says:
>
> A CMP or TEST instruction immediately followed by a conditional jump can be
> fused into a single macro-op. This applies to all versions of the CMP and TEST
> instructions and all conditional jumps except if the CMP or TEST has a
> rip-relative address or both a displacement and an immediate operand.
>
> So it is a bit more weird. Perhaps you can extend your predicate to look
> for IP relative addresses & displacements of CMP and TEST, too.
>
> Honza

Thanks for checking it. Agner's guide also mentions this constraint for
Sandy Bridge, Ivy Bridge.... I missed it because the Intel optimization
reference manual doesn't mention it. I ran an experiment just now and
verified that the constraint exists on Sandy Bridge. Will add the
predicate.

Thanks,
Wei Mi.
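To make the checks discussed above concrete, here is a minimal standalone sketch in C. It is not the patch itself: the variables, `struct cmp_insn`, `enum jcc_kind` and `cmp_branch_fusible_p` are hypothetical stand-ins that only model TARGET_64BIT, the two per-mode tuning flags, the SOFLAGS flag, and the operand restriction quoted from Agner Fog's guide. The real predicate would inspect RTL rather than a toy struct.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the tuning state; the real GCC
   definitions are not reproduced here.  */
static bool target_64bit;                /* models TARGET_64BIT */
static bool fuse_cmp_and_branch_32;      /* models TARGET_FUSE_CMP_AND_BRANCH_32 */
static bool fuse_cmp_and_branch_64;      /* models TARGET_FUSE_CMP_AND_BRANCH_64 */
static bool fuse_cmp_and_branch_soflags; /* models the SOFLAGS tuning flag */

/* One macro that selects the 32-bit or 64-bit variant, so callers
   do not repeat the TARGET_64BIT test at every use.  */
#define TARGET_FUSE_CMP_AND_BRANCH \
  (target_64bit ? fuse_cmp_and_branch_64 : fuse_cmp_and_branch_32)

/* Toy classification of the conditional jump that follows the compare.  */
enum jcc_kind
{
  JCC_OTHER,         /* e.g. JE/JNE/JB/JA */
  JCC_SIGN_OVERFLOW  /* JL/JG/JLE/JGE, which read SF and OF */
};

/* Toy description of the operands of a CMP or TEST instruction.  */
struct cmp_insn
{
  bool rip_relative;     /* memory operand uses a rip-relative address */
  bool has_displacement; /* memory operand carries a displacement */
  bool has_immediate;    /* instruction has an immediate operand */
};

/* Return true if CMP followed by a jump of kind JCC may be fused
   under the modelled tuning flags and operand restrictions.  */
static bool
cmp_branch_fusible_p (const struct cmp_insn *cmp, enum jcc_kind jcc)
{
  if (!TARGET_FUSE_CMP_AND_BRANCH)
    return false;
  /* Some cores cannot fuse cmp with jumps that test SF or OF.  */
  if (jcc == JCC_SIGN_OVERFLOW && !fuse_cmp_and_branch_soflags)
    return false;
  /* Restrictions quoted from Agner Fog's guide.  */
  if (cmp->rip_relative)
    return false;
  if (cmp->has_displacement && cmp->has_immediate)
    return false;
  return true;
}

/* Small usage example for a 64-bit target without SOFLAGS support.  */
static void
fusion_demo (void)
{
  struct cmp_insn reg_reg = { false, false, false };
  struct cmp_insn disp_imm = { false, true, true };

  target_64bit = true;
  fuse_cmp_and_branch_64 = true;
  fuse_cmp_and_branch_32 = false;
  fuse_cmp_and_branch_soflags = false;

  assert (cmp_branch_fusible_p (&reg_reg, JCC_OTHER));
  assert (!cmp_branch_fusible_p (&reg_reg, JCC_SIGN_OVERFLOW));
  assert (!cmp_branch_fusible_p (&disp_imm, JCC_OTHER));
}
```

Keeping the mode selection inside one macro means the predicate never tests TARGET_64BIT directly, which is the point of the suggestion quoted above. The operand checks are deliberately conservative: any rip-relative compare, or any compare with both a displacement and an immediate, is treated as not fusible.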