Hi! If we try to simplify a paradoxical subreg of a CONST_WIDE_INT, we can ICE, because wi::extract_uhwi doesn't like accessing bytes beyond the precision. for (i = 0; i < elem_bitsize; i += value_bit) *vp++ = wi::extract_uhwi (val, i, value_bit); for (; i < elem_bitsize; i += value_bit) *vp++ = extend; looks wrong, because the second loop is useless. But, we actually do want the second loop, to handle the paradoxical bytes.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-11-25 Jakub Jelinek <ja...@redhat.com> PR rtl-optimization/78526 * simplify-rtx.c (simplify_immed_subreg): Don't use wi::extract_uhwi beyond val's precision. * gcc.dg/pr78526.c: New test. --- gcc/simplify-rtx.c.jj 2016-11-18 20:04:28.000000000 +0100 +++ gcc/simplify-rtx.c 2016-11-25 15:07:13.872322389 +0100 @@ -5740,8 +5740,9 @@ simplify_immed_subreg (machine_mode oute { rtx_mode_t val = rtx_mode_t (el, innermode); unsigned char extend = wi::sign_mask (val); + int prec = wi::get_precision (val); - for (i = 0; i < elem_bitsize; i += value_bit) + for (i = 0; i < prec && i < elem_bitsize; i += value_bit) *vp++ = wi::extract_uhwi (val, i, value_bit); for (; i < elem_bitsize; i += value_bit) *vp++ = extend; --- gcc/testsuite/gcc.dg/pr78526.c.jj 2016-11-25 15:15:53.266799450 +0100 +++ gcc/testsuite/gcc.dg/pr78526.c 2016-11-25 15:16:12.004564146 +0100 @@ -0,0 +1,21 @@ +/* PR rtl-optimization/78526 */ +/* { dg-do compile { target int128 } } */ +/* { dg-options "-O -fno-tree-ccp -fno-tree-sra -g -w" } */ +/* { dg-additional-options "-mavx512bw" { target i?86-*-* x86_64-*-* } } */ + +typedef unsigned U __attribute__ ((vector_size (64))); +typedef unsigned __int128 V __attribute__ ((vector_size (64))); + +static inline V +bar (U u, U x, V v) +{ + v = (V)(U) { 0, ~0 }; + v[x[0]] <<= u[-63]; + return v; +} + +V +foo (U u) +{ + return bar (u, (U) {}, (V) {}); +} Jakub