On Tue, Oct 9, 2018 at 3:28 PM Richard Biener wrote:
>
> On Mon, 8 Oct 2018, Richard Biener wrote:
>
> > On Fri, 5 Oct 2018, Uros Bizjak wrote:
> >
> > > On Thu, Oct 4, 2018 at 2:05 PM Richard Biener wrote:
> > > >
> > > >
> > > > This tries to apply the same trick to sminmax reduction patterns
>
This tries to apply the same trick to sminmax reduction patterns
as for the reduc_plus_scal ones, namely reduce %zmm -> %ymm -> %xmm
first. On a microbenchmark this improves performance on Zen
by ~30% for AVX2 and on Skylake-SP by ~10% for AVX512 (on Skylake-SP
there's no measurable difference for AVX2).
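
To make the reduction order concrete, here is a minimal scalar sketch in C of a signed-max reduction that folds the upper half of the live lanes onto the lower half at each step, which is the shape of a %zmm -> %ymm -> %xmm narrowing. It is only a model under stated assumptions: the name reduce_smax_halving is hypothetical, plain arrays stand in for vector registers, and it is not the machine-description pattern from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a halving signed-max reduction (hypothetical helper,
   not code from the patch).  With n == 16 the first step corresponds to
   folding a 512-bit vector into 256 bits, the second to folding 256 bits
   into 128 bits; the remaining steps stay within the narrowest width.
   n must be a power of two.  The array v is overwritten.  */
static int32_t
reduce_smax_halving (int32_t *v, size_t n)
{
  assert (n != 0 && (n & (n - 1)) == 0);

  for (size_t half = n / 2; half >= 1; half /= 2)
    {
      /* Lane-wise max of the upper half into the lower half.  */
      for (size_t i = 0; i < half; i++)
        if (v[i + half] > v[i])
          v[i] = v[i + half];
    }

  /* After the last step the result sits in lane 0.  */
  return v[0];
}
```

Calling it on a 16-element array returns the largest element; only the order in which lanes are combined differs from a straight left-to-right scan.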
I g