Re: [PATCH] x86: correct and improve "*vec_dupv2di"

2023-06-15 Thread Uros Bizjak via Gcc-patches
On Thu, Jun 15, 2023 at 10:15 AM Jan Beulich wrote:
>
> On 15.06.2023 09:45, Hongtao Liu wrote:
> > On Thu, Jun 15, 2023 at 3:07 PM Uros Bizjak via Gcc-patches wrote:
> >> On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches wrote:
> >>> +case 3:
> >>> + return "%vmovddu

Re: [PATCH] x86: correct and improve "*vec_dupv2di"

2023-06-15 Thread Jan Beulich via Gcc-patches
On 15.06.2023 09:45, Hongtao Liu wrote:
> On Thu, Jun 15, 2023 at 3:07 PM Uros Bizjak via Gcc-patches wrote:
>> On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches wrote:
>>> +case 3:
>>> + return "%vmovddup\t{%1, %0|%0, %1}";
>>> +case 4:
>>> + return "movlhps\t%0,

Re: [PATCH] x86: correct and improve "*vec_dupv2di"

2023-06-15 Thread Hongtao Liu via Gcc-patches
On Thu, Jun 15, 2023 at 3:07 PM Uros Bizjak via Gcc-patches wrote:
>
> On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches wrote:
> >
> > The input constraint for the %vmovddup alternative was wrong, as the
> > upper 16 XMM registers require AVX512VL to be used with this insn. To
> > co

Re: [PATCH] x86: correct and improve "*vec_dupv2di"

2023-06-15 Thread Uros Bizjak via Gcc-patches
On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches wrote:
>
> The input constraint for the %vmovddup alternative was wrong, as the
> upper 16 XMM registers require AVX512VL to be used with this insn. To
> compensate, introduce a new alternative permitting all 32 registers, by
> broadcasti

[PATCH] x86: correct and improve "*vec_dupv2di"

2023-06-14 Thread Jan Beulich via Gcc-patches
The input constraint for the %vmovddup alternative was wrong, as the
upper 16 XMM registers require AVX512VL to be used with this insn. To
compensate, introduce a new alternative permitting all 32 registers, by
broadcasting to the full 512 bits in that case if AVX512VL is not
available.

gcc/
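
For readers skimming the thread: the pattern name "*vec_dupv2di" refers to duplicating one 64-bit element into both lanes of a 128-bit (V2DI) vector. A minimal C sketch of that source-level operation follows, assuming GCC's vector extensions. The type name `v2di` and the function `dup_v2di` are hypothetical and not taken from the patch. Which instruction the compiler actually emits for it depends on the target flags and on register allocation, so the sketch makes no claim about that.

```c
/* Hypothetical illustration only; not part of the patch. */

/* 128-bit vector of two 64-bit integers (GCC vector extension). */
typedef long long v2di __attribute__((vector_size(16)));

/* Duplicate a 64-bit scalar into both lanes of a V2DI vector. */
v2di dup_v2di(long long x)
{
    return (v2di){ x, x };
}
```

Compiling this with and without AVX512VL enabled (for example with `-O2 -mavx512f` versus `-O2 -mavx512vl`) and inspecting the assembly is one way to observe how the chosen alternative can differ.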