Re: [PATCH] [i386] Optimize v4si broadcast for noavx512vl.

2022-03-04 Thread H.J. Lu via Gcc-patches
On Fri, Mar 4, 2022 at 8:40 AM Richard Biener via Gcc-patches wrote:
>
> > On 04.03.2022 at 03:30, Hongtao Liu via Gcc-patches wrote:
> >
> > On Fri, Mar 4, 2022 at 10:29 AM liuhongt via Gcc-patches wrote:
> >>
> >> This is an incremental patch based on [1], it enables optimization a

Re: [PATCH] [i386] Optimize v4si broadcast for noavx512vl.

2022-03-04 Thread Richard Biener via Gcc-patches
> On 04.03.2022 at 03:30, Hongtao Liu via Gcc-patches wrote:
>
> On Fri, Mar 4, 2022 at 10:29 AM liuhongt via Gcc-patches wrote:
>>
>> This is an incremental patch based on [1]; it enables the optimization below:
>>
>> -       vbroadcastss    .LC1(%rip), %xmm0
>> +       movl    $-45, %ed

Re: [PATCH] [i386] Optimize v4si broadcast for noavx512vl.

2022-03-04 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 4, 2022 at 3:28 AM liuhongt wrote:
>
> This is an incremental patch based on [1]; it enables the optimization below:
>
> -       vbroadcastss    .LC1(%rip), %xmm0
> +       movl    $-45, %edx
> +       vmovd   %edx, %xmm0
> +       vpshufd $0, %xmm0, %xmm0
>
> According to a microbenchmark, i

Re: [PATCH] [i386] Optimize v4si broadcast for noavx512vl.

2022-03-03 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 4, 2022 at 10:29 AM liuhongt via Gcc-patches wrote:
>
> This is an incremental patch based on [1]; it enables the optimization below:
>
> -       vbroadcastss    .LC1(%rip), %xmm0
> +       movl    $-45, %edx
> +       vmovd   %edx, %xmm0
> +       vpshufd $0, %xmm0, %xmm0
>
> According to

[PATCH] [i386] Optimize v4si broadcast for noavx512vl.

2022-03-03 Thread liuhongt via Gcc-patches
This is an incremental patch based on [1]; it enables the optimization below:

-       vbroadcastss    .LC1(%rip), %xmm0
+       movl    $-45, %edx
+       vmovd   %edx, %xmm0
+       vpshufd $0, %xmm0, %xmm0

According to a microbenchmark, it's faster than broadcasting from memory.

[1] https://gcc.gnu.org/
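
For readers who want to look at the code generation themselves, here is a minimal sketch of C source that broadcasts the constant -45 into all four lanes of a V4SI vector. The patch's actual test case and compiler options are not visible in this excerpt, so the function names below (broadcast_minus_45, store_lanes, check_broadcast) are hypothetical, and whether a given GCC version emits the vbroadcastss or the movl/vmovd/vpshufd sequence for it is not something this sketch claims.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* GCC/Clang vector extension: four 32-bit ints in one 16-byte vector (V4SI). */
typedef int32_t v4si __attribute__((vector_size(16)));

/* Hypothetical example: return a vector with -45 in every lane. */
v4si broadcast_minus_45(void)
{
    return (v4si){ -45, -45, -45, -45 };
}

/* Copy the four lanes into a plain array so they can be inspected. */
void store_lanes(v4si v, int32_t out[4])
{
    memcpy(out, &v, sizeof v);
}

/* Sanity check: every lane must hold the broadcast value. */
void check_broadcast(void)
{
    int32_t lanes[4];
    store_lanes(broadcast_minus_45(), lanes);
    for (int i = 0; i < 4; i++)
        assert(lanes[i] == -45);
}
```

Compiling this with something like `gcc -O2 -mavx2 -S` and reading the assembly for broadcast_minus_45 is one way to see which sequence the compiler picks; the flags here are an assumption, not taken from the patch.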