On Tue, 29 Nov 2022 at 20:43, Andrew Pinski <pins...@gmail.com> wrote:
>
> On Tue, Nov 29, 2022 at 6:40 AM Prathamesh Kulkarni via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Hi,
> > For the following test-case:
> >
> > int16x8_t foo(int16_t x, int16_t y)
> > {
> >   return (int16x8_t) { x, y, x, y, x, y, x, y };
> > }
>
> (Not to block this patch)
> Seems like this trick can be done even with less than perfect initializer too:
> e.g.
> int16x8_t foo(int16_t x, int16_t y)
> {
>   return (int16x8_t) { x, y, x, y, x, y, x, 0 };
> }
>
> Which should generate something like:
> dup v0.8h, w0
> dup v1.8h, w1
> zip1 v0.8h, v0.8h, v1.8h
> ins v0.h[7], wzr
Hi Andrew,
Nice catch, thanks for the suggestions!
More generally, code-gen for vector initializers that mix variables and
constants seems to be sub-optimal.
For example:
int16x8_t foo(int16_t x)
{
  return (int16x8_t) { x, x, x, x, x, x, x, 1 };
}

results in:
foo:
        movi    v0.8h, 0x1
        ins     v0.h[0], w0
        ins     v0.h[1], w0
        ins     v0.h[2], w0
        ins     v0.h[3], w0
        ins     v0.h[4], w0
        ins     v0.h[5], w0
        ins     v0.h[6], w0
        ret

which I suppose could instead be the following?
foo:
        dup     v0.8h, w0
        mov     w1, 0x1
        ins     v0.h[7], w1
        ret
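
As a sanity check on the sequences discussed in this thread, here is a
minimal sketch in portable C that models the lane semantics of dup, zip1
and ins on eight 16-bit lanes. It is only a model for checking that the
shorter sequences build the same vectors, not how GCC expands vector
initializers. The v8h type and all model_* and seq_* names are
hypothetical and are not part of the patch or of arm_neon.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 8

/* Hypothetical stand-in for a 128-bit register viewed as 8 x int16. */
typedef struct { int16_t h[LANES]; } v8h;

/* dup vD.8h, wN: broadcast one scalar into every lane. */
static v8h model_dup(int16_t s)
{
    v8h r;
    for (int i = 0; i < LANES; i++)
        r.h[i] = s;
    return r;
}

/* zip1 vD.8h, vN.8h, vM.8h: interleave the low halves of a and b. */
static v8h model_zip1(v8h a, v8h b)
{
    v8h r;
    for (int i = 0; i < LANES / 2; i++) {
        r.h[2 * i] = a.h[i];
        r.h[2 * i + 1] = b.h[i];
    }
    return r;
}

/* ins vD.h[lane], wN: overwrite a single lane, keep the others. */
static v8h model_ins(v8h v, int lane, int16_t s)
{
    v.h[lane] = s;
    return v;
}

/* { x, y, x, y, x, y, x, 0 } as dup + dup + zip1 + ins. */
static v8h seq_interleave_zero_tail(int16_t x, int16_t y)
{
    v8h v = model_zip1(model_dup(x), model_dup(y));
    return model_ins(v, 7, 0);
}

/* { x, x, x, x, x, x, x, 1 } as dup + ins of the constant lane. */
static v8h seq_dup_const_tail(int16_t x)
{
    return model_ins(model_dup(x), 7, 1);
}

/* Model of the current code-gen: movi of the constant, then seven ins. */
static v8h seq_movi_then_ins(int16_t x)
{
    v8h v = model_dup(1);
    for (int lane = 0; lane < LANES - 1; lane++)
        v = model_ins(v, lane, x);
    return v;
}

/* Returns 1 if the modeled sequences agree with the initializers. */
int model_self_check(void)
{
    const int16_t x = 3, y = -7;
    const v8h want_zip = { { x, y, x, y, x, y, x, 0 } };
    const v8h want_dup = { { x, x, x, x, x, x, x, 1 } };
    v8h got_zip = seq_interleave_zero_tail(x, y);
    v8h got_dup = seq_dup_const_tail(x);
    v8h got_old = seq_movi_then_ins(x);

    assert(memcmp(got_zip.h, want_zip.h, sizeof want_zip.h) == 0);
    assert(memcmp(got_dup.h, want_dup.h, sizeof want_dup.h) == 0);
    assert(memcmp(got_old.h, got_dup.h, sizeof got_dup.h) == 0);
    return 1;
}
```

Calling model_self_check() from a small driver exercises all three
sequences. The model says nothing about instruction cost; it only shows
that the two-instruction form and the movi + 7 ins form produce the same
lanes.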

I will try to address this in a follow-up patch.

Thanks,
Prathamesh

>
> Thanks,
> Andrew Pinski
>
>
> >
> > Code gen at -O3:
> > foo:
> >         dup    v0.8h, w0
> >         ins     v0.h[1], w1
> >         ins     v0.h[3], w1
> >         ins     v0.h[5], w1
> >         ins     v0.h[7], w1
> >         ret
> >
> > For 16 elements, it results in 8 ins instructions which might not be
> > optimal perhaps.
> > I guess, the above code-gen would be equivalent to the following ?
> > dup v0.8h, w0
> > dup v1.8h, w1
> > zip1 v0.8h, v0.8h, v1.8h
> >
> > I have attached patch to do the same, if number of elements >= 8,
> > which should be possibly better compared to current code-gen ?
> > Patch passes bootstrap+test on aarch64-linux-gnu.
> > Does the patch look OK ?
> >
> > Thanks,
> > Prathamesh
