https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84935

--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 19 Mar 2018, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84935
> 
> --- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> Seems it actually is vectorized, probably just using DImode vectors for
> 2xSImode, and dom doesn't handle vector stores followed by scalar loads.
> Before store-merging the dump is:
>   MEM[(int *)&a] = { 0, 1 };
>   MEM[(int *)&a + 8B] = { 4, 9 };
>   MEM[(int *)&a + 16B] = { 16, 25 };
>   MEM[(int *)&a + 24B] = { 36, 49 };
>   MEM[(int *)&a + 32B] = { 64, 81 };
>   _6 = a[0];
>   _28 = a[1];
>   res_29 = _6 + _28;
>   _35 = a[2];
>   res_36 = res_29 + _35;
>   _42 = a[3];
>   res_43 = res_36 + _42;
>   _49 = a[4];
>   res_50 = res_43 + _49;
>   _56 = a[5];
>   res_57 = res_50 + _56;
>   _63 = a[6];
>   res_64 = res_57 + _63;
>   _70 = a[7];
>   res_71 = res_64 + _70;
>   _77 = a[8];
>   res_78 = res_71 + _77;
>   _2 = a[9];
>   res_11 = _2 + res_78;
>   a ={v} {CLOBBER};
>   return res_11;
> and nothing really changes till *.optimized in it.

Similar to the nvptx case.  Yes, DOM doesn't handle this, while FRE would.
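
For illustration only, here is a minimal sketch of the kind of source that
could lead to a dump shaped like the one quoted above.  It is a hypothetical
reconstruction from the dump (the function name `sum_squares` and the exact
loop form are assumptions), not the testcase attached to the PR.

```c
/* Hypothetical reconstruction from the quoted dump; not the PR's testcase. */
int sum_squares(void)
{
    int a[10];
    int res = 0;

    /* Store loop.  Once unrolled and vectorized, these stores may show up
       as two-element vector stores like MEM[(int *)&a] = { 0, 1 }. */
    for (int i = 0; i < 10; i++)
        a[i] = i * i;

    /* Scalar loads from the same memory (a[0], a[1], ...).  A pass that
       can look through the vector stores could replace each load with
       the constant that was stored. */
    for (int i = 0; i < 10; i++)
        res += a[i];

    return res;
}
```

If every scalar load were forwarded from the preceding vector stores, the
whole function would fold to `return 285;`.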
