On Thu, Jun 27, 2019 at 09:24:58AM -0600, Jeff Law wrote:
> On 6/27/19 12:05 AM, Jakub Jelinek wrote:
> > On Wed, Jun 26, 2019 at 12:19:28PM +0200, Uros Bizjak wrote:
> >> Yes, the patch works OK. I'll regression test it and push it later today.
> >
> > I think it caused
> > +FAIL: gcc.dg/tree-ssa/pr84512.c scan-tree-dump optimized "return 285;"
> > which admittedly already is xfailed on various targets.
> > We now newly vectorize those loops and there is no FRE or similar pass
> > after vectorization to clean it up, in particular optimize the
> > a[8] and a[9] loads given the MEM <vector(2) int> [(int *)&a + 32B]
> > store:
> >   MEM <vector(2) int> [(int *)&a + 32B] = { 64, 81 };
> >   _13 = a[8];
> >   res_6 = _13 + 140;
> >   _18 = a[9];
> >   res_15 = res_6 + _18;
> >   a ={v} {CLOBBER};
> >   return res_15;
> >
> > Shall we xfail it, or is there a plan to enable FRE after vectorization,
> > or similar pass that would be able to do similar memory optimizations?
> > Note, the RTL passes are able to optimize it in the end in this testcase.
> I wonder if we could logically break up the vector store within DOM. If
> we did that we'd end up with a[8] and a[9] in DOM's expression hash
> table. That would allow us to replace the loads into _13 and _18 with
> constants and the rest should just fall out.
>
> Care to open a BZ? If so, go ahead and assign it to me.

I think Richi is working on adding fre3 now.

        Jakub
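
For anyone reading along without the dump at hand, here is a rough sketch of the arithmetic involved. It assumes the testcase has roughly the shape of a loop storing i * i into a 10-element int array followed by a loop summing it; I have not copied the actual test, and both function names below are made up for illustration. The second function is a hand-written mirror of the quoted dump, with two scalar stores standing in for the vector(2) store at &a + 32B (index 8, assuming 4-byte int).

```c
/* Hypothetical reduction; names are illustrative only and are not
   taken from the testsuite.  */

/* Fill a[i] = i * i for i in 0..9, then sum the array.  */
int
sum_squares (void)
{
  int a[10];
  int res = 0;
  for (int i = 0; i < 10; i++)
    a[i] = i * i;
  for (int i = 0; i < 10; i++)
    res += a[i];
  return res;
}

/* Hand-written mirror of the quoted dump.  The first eight terms are
   already folded into the constant 140 (0 + 1 + 4 + ... + 49); only
   the a[8] and a[9] loads remain.  */
int
sum_squares_as_dumped (void)
{
  int a[10];
  a[8] = 64;              /* stands in for the { 64, 81 } vector store */
  a[9] = 81;
  int res = a[8] + 140;   /* _13 = a[8]; res_6 = _13 + 140; */
  res += a[9];            /* _18 = a[9]; res_15 = res_6 + _18; */
  return res;
}
```

Both forms evaluate to 285, so once something forwards the stored 64 and 81 into the two loads, the body should fold down to the "return 285;" the scan is looking for.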