On Thu, Dec 2, 2021 at 6:07 AM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> Introduce vec_set_0 pattern for V8HI and V8HF modes to implement scalar
> element 0 inserts from a GP register, SSE register or memory.  Also
> add V8HI and V8HF AVX2 (x,x,x) alternative to PINSR insn pattern, which is
> split after reload to a sequence of PBROADCASTW and PBLENDW.
>
> The V8HF inserts from memory improve from:
>
> -       vpbroadcastw    4(%esp), %xmm1
> -       vpblendw        $16, %xmm1, %xmm0, %xmm0
> +       vpinsrw $4, 4(%esp), %xmm0, %xmm0
>
> and V8HF inserts from SSE register to element 0 improve from:
>
>         vpxor   %xmm2, %xmm2, %xmm2
> -       vpbroadcastw    %xmm0, %xmm0
>         vpblendw        $1, %xmm0, %xmm2, %xmm0
>
> Based on the above improvements, the register allocator is able to determine
> the optimal instruction (or instruction sequence) based on the register set
> of the input value, so there is no need to manually expand V8HI and V8HF
> inserts to the sequence of VEC_DUPLICATE and VEC_MERGE RTXes.
>
Thanks.
> 2021-12-01  Uroš Bizjak  <ubiz...@gmail.com>
>
> gcc/ChangeLog:
>
>     PR target/102811
>     * config/i386/sse.md (VI2F): Remove mode iterator.
>     (VI2F_256_512): New mode iterator.
>     (vec_set<V8_128:mode>_0): New insn pattern.
>     (vec_set<VI2F_256_512:mode>_0): Rename from vec_set<VI2F:mode>mode.
>     Use VI2F_256_512 mode iterator instead of VI2F.
>     (*avx512fp16_movsh): Remove.
>     (<sse2p4_1>_pinsr<ssemodesuffix>): Add (x,x,x) AVX2 alternative.
>     Do not disable V8HF mode insn on AVX2 targets.
>     (pinsrw -> pbroadcast + pblendw peephole2): New peephole.
>     (pinsrw -> pbroadcast + pblendw splitter): New post-reload splitter.
>     * config/i386/i386.md (extendhfsf): Call gen_vec_setv8hf_0.
>     * config/i386/i386-expand.c (ix86_expand_vector_set)
>     <case E_V8HFmode>: Use vec_merge path for TARGET_AVX2.
>
> gcc/testsuite/ChangeLog:
>
>     PR target/102811
>     * gcc.target/i386/pr102811-1.c: New test.
>     * gcc.target/i386/avx512fp16-1c.c (dg-final): Update
>     scan-assembler-times scan strings for ia32 targets.
>     * gcc.target/i386/pr102327-1.c (dg-final): Ditto.
>     * gcc.target/i386/pr102811.c: Rename from ...
>     * gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c: ... this.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32} w/ and
> w/o -mf16c.
>
> Pushed to master.
>
> Uros.
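
For readers following along, the two kinds of insert discussed in the quoted
message can be sketched at the source level roughly as below.  This is only
an illustration, not the test case from the patch: it assumes the GCC/Clang
vector extensions, uses short (V8HI) rather than _Float16 (V8HF) so it builds
on any target, and the type and function names are made up for the example.
Which instructions the compiler actually selects depends on the target flags
and on where the register allocator places the scalar.

```c
/* Illustrative sketch only; v8hi, insert_elem4 and set_elem0_zero_rest
   are hypothetical names, not taken from the patch or its tests.  */
typedef short v8hi __attribute__ ((vector_size (16)));

/* Insert X into element 4, leaving the other seven elements unchanged.  */
v8hi
insert_elem4 (v8hi v, short x)
{
  v[4] = x;
  return v;
}

/* Place X in element 0 and zero the remaining elements.  */
v8hi
set_elem0_zero_rest (short x)
{
  v8hi v = { 0 };
  v[0] = x;
  return v;
}
```

Compiling with -O2 -mavx2 -S and reading the assembly for these two functions
is one way to see which sequence was picked for a given input location.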



-- 
BR,
Hongtao
