Assuming a fully pipelined vector unit (and from experience on
AArch64), a u-arch's scalar-to-vector move cost is likely to play a
significant role in whether this will be profitable or not.
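
As a rough sketch of that trade-off (not a proposal for the actual
costing structure): the struct, its fields and the function name below
are all hypothetical and do not exist in GCC. It only compares the two
sequences discussed in this thread, scalar load + vmv.v.x versus
vmv.v.i + vsll.vi, using assumed per-uarch cycle costs.

```c
#include <stdbool.h>

/* Hypothetical per-uarch costs in cycles; none of these names exist in GCC. */
struct uarch_vec_costs {
  int scalar_load;    /* scalar load of the constant */
  int scalar_to_vec;  /* scalar-to-vector move, e.g. vmv.v.x */
  int vec_imm_splat;  /* vector immediate splat, e.g. vmv.v.i */
  int vec_shift_imm;  /* vector shift by immediate, e.g. vsll.vi */
};

/* Return true when building the constant on the vector side
   (vmv.v.i + vsll.vi) is estimated to cost no more than loading it
   on the scalar side and moving it across (scalar load + vmv.v.x). */
static bool
prefer_vector_synthesis (const struct uarch_vec_costs *c)
{
  int scalar_path = c->scalar_load + c->scalar_to_vec;
  int vector_path = c->vec_imm_splat + c->vec_shift_imm;
  return vector_path <= scalar_path;
}
```

With an expensive scalar-to-vector move the vector-side sequence wins;
with a cheap one the scalar-side sequence can win. A simple sum like
this ignores issue width and overlap, so it is only a first
approximation.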

--Philipp.

On Wed, 31 May 2023 at 00:10, Jeff Law via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
>
> On 5/30/23 16:01, 钟居哲 wrote:
> > I agree with Andrew.
> >
> > And I don't think this patch is appropriate for the following reasons:
> > 1. This patch increases vector workload in the machine since
> >       it converts scalar load + vmv.v.x into vmv.v.i + vsll.vi.
> This is probably uarch dependent.  I can probably construct cases where
> the first will be better and I can probably construct cases where the
> latter will be better.  In fact the recommendation from our uarch team
> is to generally do this stuff on the vector side.
>
>
>
> > 2. For a multi-issue OoO machine, scalar instructions are very cheap
> >      when they are interleaved with vector codegen. For example, a
> >      sequence like this:
> >        scalar insn
> >        scalar insn
> >        vector insn
> >        scalar insn
> >        vector insn
> >        ....
> >        In such a situation, we can issue multiple instructions
> >        simultaneously, and the latency of the scalar instructions will
> >        be hidden, so the scalar instructions are cheap. Whereas this
> >        patch increases the vector pipeline workload, which is not
> >        friendly to the OoO machines I mentioned above.
> I probably need to be careful what I say here :-)  I'll go with: mixing
> vector/scalar code may incur certain penalties on some
> microarchitectures depending on the exact code sequences involved.
>
>
> > 3.   I can imagine the only benefit of this patch is that we can reduce
> >        scalar register pressure in some extreme circumstances. However,
> >        I don't think this benefit is "real", since GCC should schedule
> >        the instruction sequence well once we have tuned the vector
> >        instruction scheduling model and cost model, making such register
> >        live ranges very short when the scalar register pressure is
> >        very high.
> >
> > Overall, I disagree with this patch.
> What I think this all argues is that it'll likely need to be uarch
> dependent.  I'm not yet sure how to describe the properties of the
> uarch in a concise manner to put into our costing structure, though.
>
> jeff
