https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97387
Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I actually think that what we emit for these builtins is right, the problem is
that combiner is not able to optimize

(insn 10 9 11 2 (set (reg:QI 88 [ _31 ])
        (ltu:QI (reg:CCC 17 flags)
            (const_int 0 [0]))) "include/adxintrin.h":69:10 784 {*setcc_qi}
     (expr_list:REG_DEAD (reg:CCC 17 flags)
        (nil)))

followed (with instructions that don't clobber flags in between) by:

(insn 17 15 18 2 (parallel [
            (set (reg:CCC 17 flags)
                (compare:CCC (plus:QI (reg:QI 88 [ _31 ])
                        (const_int -1 [0xffffffffffffffff]))
                    (reg:QI 88 [ _31 ])))
            (clobber (scratch:QI))
        ]) "include/adxintrin.h":69:10 349 {*addqi3_cconly_overflow_1}
     (expr_list:REG_DEAD (reg:QI 88 [ _31 ])
        (nil)))

into nothing. It tries that:

Trying 10 -> 17:
   10: r88:QI=ltu(flags:CCC,0)
      REG_DEAD flags:CCC
   17: {flags:CCC=cmp(r88:QI-0x1,r88:QI);clobber scratch;}
      REG_DEAD r88:QI
Failed to match this instruction:
(parallel [
        (set (reg:CC 17 flags)
            (compare:CC (neg:QI (geu:QI (reg:CCC 17 flags)
                        (const_int 0 [0])))
                (ltu:QI (reg:CCC 17 flags)
                    (const_int 0 [0]))))
        (clobber (scratch:QI))
    ])
Failed to match this instruction:
(set (reg:CC 17 flags)
    (compare:CC (neg:QI (geu:QI (reg:CCC 17 flags)
                (const_int 0 [0])))
        (ltu:QI (reg:CCC 17 flags)
            (const_int 0 [0]))))

Similarly, the

Trying 10, 17 -> 18:
   10: r88:QI=ltu(flags:CCC,0)
      REG_DEAD flags:CCC
   17: {flags:CCC=cmp(r88:QI-0x1,r88:QI);clobber scratch;}
      REG_DEAD r88:QI
   18: {flags:CCC=cmp(zero_extend(ltu(flags:CCC,0)+r106:DI+r107:DI),zero_extend(r107:DI)+ltu(flags:CCC,0));r109:DI=ltu(flags:CCC,0)+r106:DI+r107:DI;}
      REG_DEAD r107:DI
      REG_DEAD r106:DI
Can't combine i1 into i3

fails because it would want to set flags multiple times and punts because of
that.
The 10 -> 17 combination seems more promising, though I'm not sure that CCmode
rather than CCCmode is desirable in that case.
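
To illustrate why the insn 10 / insn 17 pair could in principle be optimized
"into nothing", here is a minimal sketch in plain C that models only the carry
bit. It is not GCC code: the function names are hypothetical, and it assumes
that in CCCmode the only meaningful result of the compare is the carry out of
the QImode addition. Under that assumption, materializing the carry with setc
and then adding -1 to the result reproduces the carry that was already in
flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of insn 10: r88 = ltu(flags:CCC, 0), i.e. setc.
   The result is 1 if the carry flag was set, otherwise 0. */
static uint8_t setc_model(int carry_flag)
{
    return carry_flag ? 1 : 0;
}

/* Hypothetical model of insn 17:
   flags:CCC = compare(r88 + (-1), r88).
   Assumption: the carry is set iff the 8-bit addition wraps, which
   shows up as the truncated sum being below the original operand. */
static int add_minus_one_carry_model(uint8_t r88)
{
    uint8_t sum = (uint8_t)(r88 + 0xff);   /* QImode r88 + (-1) */
    return sum < r88;
}

/* Returns 1 if the setc / add -1 round trip leaves the carry unchanged
   for both possible input values of the carry flag. */
int roundtrip_preserves_carry(void)
{
    for (int cf = 0; cf <= 1; cf++)
        if (add_minus_one_carry_model(setc_model(cf)) != cf)
            return 0;
    return 1;
}
```

With r88 limited to 0 or 1, the addition wraps exactly when r88 is 1, so the
recreated carry equals the original one. That is the property a combine
pattern for the 10 -> 17 case would be relying on.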