https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62080
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Marc Glisse from comment #5)
> With the intrinsics patch, I notice that we don't simplify in gimple either:
>
> _40 = VIEW_CONVERT_EXPR<__m128i>(_39);
> MEM[(__m128i * {ref-all})vec_4(D)] = _40;
> _60 = MEM[(const double *)vec_4(D)];
> _61 = MEM[(const double *)vec_4(D) + 8B];
> _62 = {_60, _61};
> _63 = VIEW_CONVERT_EXPR<__v4si>(_62);
>
> (_39 and _63 have the same type)

Value numbering has difficulties seeing through so many stmts (read: not
implemented), and it doesn't have a way of expressing "partial" values.
That is, it knows that at MEM[(__m128i * {ref-all})vec_4(D)] we stored _39,
but when value-numbering the partial reads it can't assign the value _39 to
them (as said, "partial" values are not supported).

So one way to optimize this is to special-case the composition operations
and try looking up a proper memory operation.  Another possibility is to
value-number compound operations also as piecewise operations, introducing
fake value numbers (that is, "lower" everything to component-wise operations
internally).

I suppose pattern-matching

> _60 = MEM[(const double *)vec_4(D)];
> _61 = MEM[(const double *)vec_4(D) + 8B];
> _62 = {_60, _61};

and generating a single read (possibly with a permute?) would be more
profitable and easier to implement.
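
For concreteness, here is a reduced sketch of source code that could produce
this store-then-piecewise-reload shape, and of what the suggested pattern
match would turn the reload into.  This is an illustrative assumption, not
the actual testcase attached to the PR, and the function names are made up
for the example:

/* Illustrative only: assumed shape of the problem, not the PR's testcase. */
#include <emmintrin.h>

/* Store x through vec, then reload it piecewise as two doubles and rebuild
   a vector -- roughly the _60/_61/_62/VIEW_CONVERT_EXPR sequence quoted
   above.  Value numbering cannot currently see that the result is just x
   again, because the reads are only "partial" views of the stored value.  */
__m128i piecewise_reload (__m128i *vec, __m128i x)
{
  _mm_storeu_si128 (vec, x);
  const double *p = (const double *) vec;
  __m128d v = _mm_set_pd (p[1], p[0]);   /* two scalar loads + CONSTRUCTOR */
  return _mm_castpd_si128 (v);
}

/* What the suggested pattern match would effectively produce: a single
   vector read (plus a permute if the elements were reordered), which the
   existing store/load value numbering can then fold back to x.  */
__m128i single_reload (__m128i *vec, __m128i x)
{
  _mm_storeu_si128 (vec, x);
  __m128d v = _mm_loadu_pd ((const double *) vec);
  return _mm_castpd_si128 (v);
}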