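As suggested by Segher, we have to use a code iterator so that the RTX pattern for 8-bit inserts matches both zero_extract and sign_extract sources. In a similar way, we can use any_shiftrt in the RTX pattern for 8-bit inserts whose source is a right shift, so that it matches both logical and arithmetic shifts.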
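The reason both variants are interchangeable here is that the insert stores only the low 8 bits of its source, so whatever the extension or the shift puts into the upper bits is discarded. The following is a minimal C sketch of that argument, not part of the patch. The function names are hypothetical, and the functions only model the SImode RTL operations on plain 32-bit integers.

```c
#include <assert.h>
#include <stdint.h>

/* Model of (zero_extract:SI x (const_int 8) (const_int 8)):
   bits 8..15 of x, zero-extended.  */
static uint32_t zero_extract_8_8(uint32_t x)
{
    return (x >> 8) & 0xffu;
}

/* Model of (sign_extract:SI x (const_int 8) (const_int 8)):
   bits 8..15 of x, sign-extended to 32 bits.  */
static uint32_t sign_extract_8_8(uint32_t x)
{
    uint32_t byte = (x >> 8) & 0xffu;
    return (byte ^ 0x80u) - 0x80u;  /* sign-extend using unsigned arithmetic */
}

/* Model of (lshiftrt:SI x (const_int 8)): zeros are shifted in.  */
static uint32_t lshiftrt_8(uint32_t x)
{
    return x >> 8;
}

/* Model of (ashiftrt:SI x (const_int 8)): copies of the sign bit
   are shifted in.  */
static uint32_t ashiftrt_8(uint32_t x)
{
    uint32_t fill = (x & 0x80000000u) ? 0xff000000u : 0u;
    return (x >> 8) | fill;
}

/* Model of (set (zero_extract:SI dst (const_int 8) (const_int 8)) src):
   only the low 8 bits of src end up in bits 8..15 of dst.  */
static uint32_t insert_8_8(uint32_t dst, uint32_t src)
{
    return (dst & ~0xff00u) | ((src & 0xffu) << 8);
}

/* Count sampled inputs for which the two variants of either pattern
   would produce different results.  */
static unsigned count_mismatches(void)
{
    const uint32_t dst = 0xa5a5a5a5u;
    unsigned bad = 0;

    for (uint32_t i = 0; i < 0x10000u; i++) {
        uint32_t src = i * 0x00010001u;  /* i in both 16-bit halves */

        if (insert_8_8(dst, zero_extract_8_8(src))
            != insert_8_8(dst, sign_extract_8_8(src)))
            bad++;
        if (insert_8_8(dst, lshiftrt_8(src))
            != insert_8_8(dst, ashiftrt_8(src)))
            bad++;
    }
    return bad;
}

/* Convenience wrapper: aborts if any sampled input disagrees.  */
static void check_equivalence(void)
{
    assert(count_mismatches() == 0);
}
```

Calling check_equivalence() from a small driver exercises every value of bits 8..15 together with both settings of the sign bit. It samples inputs and is not a proof, but the masking in insert_8_8 shows why the upper bits of the source can never affect the result.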
While it would be nice to have the middle-end perform the relevant simplifications, these two places are all that matter for x86.

2017-01-18  Uros Bizjak  <ubiz...@gmail.com>

    PR rtl-optimization/78952
    * config/i386/i386.md (any_extract): New code iterator.
    (*insvqi_2): Use any_extract for source operand.
    (*insvqi_3): Use any_shiftrt for source operand.

testsuite/ChangeLog:

2017-01-18  Uros Bizjak  <ubiz...@gmail.com>

    PR rtl-optimization/78952
    * gcc.target/i386/pr78952-1.c: New test.
    * gcc.target/i386/pr78952-2.c: Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 244578)
+++ config/i386/i386.md	(working copy)
@@ -2960,13 +2960,15 @@
                 (subreg:SI (match_dup 1) 0))
               (unspec [(const_int 0)] UNSPEC_NOREX_MEM)])])
 
+(define_code_iterator any_extract [sign_extract zero_extract])
+
 (define_insn "*insvqi_2"
   [(set (zero_extract:SI (match_operand 0 "ext_register_operand" "+Q")
                          (const_int 8)
                          (const_int 8))
-        (zero_extract:SI (match_operand 1 "ext_register_operand" "Q")
-                         (const_int 8)
-                         (const_int 8)))]
+        (any_extract:SI (match_operand 1 "ext_register_operand" "Q")
+                        (const_int 8)
+                        (const_int 8)))]
   ""
   "mov{b}\t{%h1, %h0|%h0, %h1}"
   [(set_attr "type" "imov")
@@ -2976,8 +2978,8 @@
   [(set (zero_extract:SI (match_operand 0 "ext_register_operand" "+Q")
                          (const_int 8)
                          (const_int 8))
-        (lshiftrt:SI (match_operand:SI 1 "register_operand" "Q")
-                     (const_int 8)))]
+        (any_shiftrt:SI (match_operand:SI 1 "register_operand" "Q")
+                        (const_int 8)))]
   ""
   "mov{b}\t{%h1, %h0|%h0, %h1}"
   [(set_attr "type" "imov")
Index: testsuite/gcc.target/i386/pr78952-1.c
===================================================================
--- testsuite/gcc.target/i386/pr78952-1.c	(nonexistent)
+++ testsuite/gcc.target/i386/pr78952-1.c	(working copy)
@@ -0,0 +1,21 @@
+/* PR target/78952 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -masm=att" } */
+/* { dg-additional-options "-mregparm=3" { target ia32 } } */
+/* { dg-final { scan-assembler-not "movsbl" } } */
+
+struct S1
+{
+  char pad1;
+  char val;
+  short pad2;
+};
+
+struct S1 foo (struct S1 a, struct S1 b)
+{
+  a.val = b.val;
+
+  return a;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]movb\[ \t\]+%.h, %.h" } } */
Index: testsuite/gcc.target/i386/pr78952-2.c
===================================================================
--- testsuite/gcc.target/i386/pr78952-2.c	(nonexistent)
+++ testsuite/gcc.target/i386/pr78952-2.c	(working copy)
@@ -0,0 +1,21 @@
+/* PR target/78952 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -masm=att" } */
+/* { dg-additional-options "-mregparm=3" { target ia32 } } */
+/* { dg-final { scan-assembler-not "sarl" } } */
+
+struct S1
+{
+  char pad1;
+  char val;
+  short pad2;
+};
+
+struct S1 foo (struct S1 a, int b)
+{
+  a.val = b >> 8;
+
+  return a;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]movb\[ \t\]+%.h, %.h" } } */