Hello!

This PR again exposes the problem where a STOS instruction is generated by the combine pass. The insn is not generated by the corresponding expander, so the ix86_current_function_needs_cld flag never gets set. Consequently, the CLD insn is not emitted in the prologue.
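To make the failure mode concrete, here is a toy model in plain C. It is only a sketch: none of these names exist in GCC, `needs_cld` merely stands in for ix86_current_function_needs_cld, and the `has_stos_marker` field stands in for an extra UNSPEC_STOS operand that only the expander emits. It assumes the recognizer matches purely on the shape of the pattern, which is a simplification of what combine and recog really do.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name below is hypothetical, not GCC code. */

/* Stand-in for the ix86_current_function_needs_cld flag. */
static bool needs_cld = false;

/* Simplified shape of a candidate insn pattern. */
struct toy_insn {
    bool stores_reg_to_mem;  /* models (set (mem) (reg)) */
    bool bumps_pointer;      /* models (set (reg) (plus (reg) (const_int N))) */
    bool has_stos_marker;    /* models (unspec [(const_int 0)] UNSPEC_STOS) */
};

/* Stand-in for the strset_singleop expander: the only path that sets
   the flag, and the only path that emits the marker.  */
static struct toy_insn expand_strset(void)
{
    needs_cld = true;
    return (struct toy_insn){ true, true, true };
}

/* Stand-in for combine merging an ordinary store with a pointer
   increment: same shape, but no marker and no flag update.  */
static struct toy_insn combine_store_and_increment(void)
{
    return (struct toy_insn){ true, true, false };
}

/* Toy recognizer.  require_marker == false models patterns without
   the marker; require_marker == true models patterns that demand it.  */
static bool matches_stos(const struct toy_insn *insn, bool require_marker)
{
    if (!insn->stores_reg_to_mem || !insn->bumps_pointer)
        return false;
    return !require_marker || insn->has_stos_marker;
}

/* True when a STOS would be matched although no CLD was requested.  */
static bool stos_without_cld(bool require_marker)
{
    needs_cld = false;
    struct toy_insn insn = combine_store_and_increment();
    return matches_stos(&insn, require_marker) && !needs_cld;
}

/* Self-check covering both scenarios and the expander path.  */
static void toy_self_check(void)
{
    assert(stos_without_cld(false));   /* the bug: STOS, but no CLD */
    assert(!stos_without_cld(true));   /* marker blocks the combination */

    struct toy_insn expanded = expand_strset();
    assert(needs_cld && matches_stos(&expanded, true));
}
```

In this model the combined insn can only be recognized as a STOS when the marker is not required, which is exactly the case where the flag was never set.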
The problematic combination is prevented by Jakub's PR55686 patch in 4.8+ branches, so the attached patch backports it to the 4.7 branch.

2013-05-13  Uros Bizjak  <ubiz...@gmail.com>

    PR target/57264
    Backport from mainline
    2013-01-22  Jakub Jelinek  <ja...@redhat.com>

    PR target/55686
    * config/i386/i386.md (UNSPEC_STOS): New.
    (strset_singleop, *strsetdi_rex_1, *strsetsi_1, *strsethi_1,
    *strsetqi_1): Add UNSPEC_STOS.

testsuite/ChangeLog:

2013-05-13  Uros Bizjak  <ubiz...@gmail.com>

    PR target/57264
    * gcc.target/i386/pr57264.c: New test.

The patch was tested on x86_64-pc-linux-gnu {-m32} on the 4.7 branch, and committed to the 4.7 branch.

The testcase will be forward-ported to 4.8 and mainline SVN.

Uros.
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md (revision 198835)
+++ config/i386/i386.md (working copy)
@@ -109,6 +109,7 @@
   UNSPEC_CALL_NEEDS_VZEROUPPER
   UNSPEC_PAUSE
   UNSPEC_LEA_ADDR
+  UNSPEC_STOS
 
   ;; For SSE/MMX support:
   UNSPEC_FIX_NOTRUNC
@@ -15912,7 +15913,8 @@
   [(parallel [(set (match_operand 1 "memory_operand" "")
                    (match_operand 2 "register_operand" ""))
               (set (match_operand 0 "register_operand" "")
-                   (match_operand 3 "" ""))])]
+                   (match_operand 3 "" ""))
+              (unspec [(const_int 0)] UNSPEC_STOS)])]
   ""
   "ix86_current_function_needs_cld = 1;")
 
@@ -15921,7 +15923,8 @@
         (match_operand:DI 2 "register_operand" "a"))
    (set (match_operand:DI 0 "register_operand" "=D")
         (plus:DI (match_dup 1)
-                 (const_int 8)))]
+                 (const_int 8)))
+   (unspec [(const_int 0)] UNSPEC_STOS)]
   "TARGET_64BIT
    && !(fixed_regs[AX_REG] || fixed_regs[DI_REG])"
   "stosq"
@@ -15934,7 +15937,8 @@
         (match_operand:SI 2 "register_operand" "a"))
    (set (match_operand:P 0 "register_operand" "=D")
         (plus:P (match_dup 1)
-                (const_int 4)))]
+                (const_int 4)))
+   (unspec [(const_int 0)] UNSPEC_STOS)]
   "!(fixed_regs[AX_REG] || fixed_regs[DI_REG])"
   "stos{l|d}"
   [(set_attr "type" "str")
@@ -15946,7 +15950,8 @@
         (match_operand:HI 2 "register_operand" "a"))
    (set (match_operand:P 0 "register_operand" "=D")
         (plus:P (match_dup 1)
-                (const_int 2)))]
+                (const_int 2)))
+   (unspec [(const_int 0)] UNSPEC_STOS)]
   "!(fixed_regs[AX_REG] || fixed_regs[DI_REG])"
   "stosw"
   [(set_attr "type" "str")
@@ -15958,7 +15963,8 @@
         (match_operand:QI 2 "register_operand" "a"))
    (set (match_operand:P 0 "register_operand" "=D")
         (plus:P (match_dup 1)
-                (const_int 1)))]
+                (const_int 1)))
+   (unspec [(const_int 0)] UNSPEC_STOS)]
   "!(fixed_regs[AX_REG] || fixed_regs[DI_REG])"
   "stosb"
   [(set_attr "type" "str")
Index: testsuite/gcc.target/i386/pr57264.c
===================================================================
--- testsuite/gcc.target/i386/pr57264.c (revision 0)
+++ testsuite/gcc.target/i386/pr57264.c (working copy)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -mcld" } */
+
+void test (int x, int **pp)
+{
+  while (x)
+    {
+      int *ip = *pp;
+      int *op = *pp;
+      while (*ip)
+        {
+          int v = *ip++;
+          *op++ = v + 1;
+        }
+    }
+}
+
+/* { dg-final { scan-assembler-not "stosl" } } */