On Fri, Mar 4, 2022 at 8:40 AM Richard Biener via Gcc-patches
wrote:
>
>
>
> > On 04.03.2022 at 03:30, Hongtao Liu via Gcc-patches
> > wrote:
> >
> > On Fri, Mar 4, 2022 at 10:29 AM liuhongt via Gcc-patches
> > wrote:
> >>
> >> This is incremental patch based on [1], it enables optimization a
> On 04.03.2022 at 03:30, Hongtao Liu via Gcc-patches
> wrote:
>
> On Fri, Mar 4, 2022 at 10:29 AM liuhongt via Gcc-patches
> wrote:
>>
>> This is incremental patch based on [1], it enables optimization as below
>>
>> - vbroadcastss .LC1(%rip), %xmm0
>> + movl $-45, %ed
On Fri, Mar 4, 2022 at 3:28 AM liuhongt wrote:
>
> This is incremental patch based on [1], it enables optimization as below
>
> - vbroadcastss .LC1(%rip), %xmm0
> + movl $-45, %edx
> + vmovd %edx, %xmm0
> + vpshufd $0, %xmm0, %xmm0
>
> According to microbenchmark, i
On Fri, Mar 4, 2022 at 10:29 AM liuhongt via Gcc-patches
wrote:
>
> This is incremental patch based on [1], it enables optimization as below
>
> - vbroadcastss .LC1(%rip), %xmm0
> + movl $-45, %edx
> + vmovd %edx, %xmm0
> + vpshufd $0, %xmm0, %xmm0
>
> According to
This is an incremental patch based on [1]. It enables the following optimization:

- vbroadcastss .LC1(%rip), %xmm0
+ movl $-45, %edx
+ vmovd %edx, %xmm0
+ vpshufd $0, %xmm0, %xmm0

According to a microbenchmark, the new sequence is faster than broadcasting from memory.
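
For readers who want to experiment with the two sequences outside the compiler, here is a minimal sketch in C with SSE2 intrinsics. It is not part of the patch. The function names (broadcast_via_gpr, broadcast_via_memory, broadcast_paths_agree) are hypothetical, and the compiler is free to pick different instructions than the comments suggest, so the generated assembly should be checked with -S before drawing conclusions.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE header) */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build the vector from a general-purpose register.
   Roughly corresponds to the movl / vmovd / vpshufd $0 sequence. */
static __m128i broadcast_via_gpr(int32_t value)
{
    __m128i low = _mm_cvtsi32_si128(value);  /* value in lane 0, other lanes zero */
    return _mm_shuffle_epi32(low, 0);        /* copy lane 0 into all four lanes */
}

/* Hypothetical helper: broadcast one 32-bit element loaded from memory.
   With AVX enabled a compiler may emit vbroadcastss here, but that is
   an assumption and not guaranteed. */
static __m128i broadcast_via_memory(const int32_t *src)
{
    float bits;
    memcpy(&bits, src, sizeof bits);         /* reinterpret the bits, no conversion */
    return _mm_castps_si128(_mm_load1_ps(&bits));
}

/* Returns 1 if both paths fill all four lanes with `value`, else 0. */
int broadcast_paths_agree(int32_t value)
{
    int32_t a[4], b[4];
    _mm_storeu_si128((__m128i *)a, broadcast_via_gpr(value));
    _mm_storeu_si128((__m128i *)b, broadcast_via_memory(&value));
    for (int i = 0; i < 4; i++) {
        if (a[i] != value || b[i] != value)
            return 0;
    }
    return 1;
}
```

Both helpers produce the same vector. The sketch only demonstrates that the results are equivalent; it says nothing about which sequence is faster, which depends on the target microarchitecture.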
[1] https://gcc.gnu.org/