On Fri, Oct 11, 2019 at 5:17 PM Wilco Dijkstra <wilco.dijks...@arm.com> wrote:
>
> Hi Ramana,
>
> >On Mon, Sep 9, 2019 at 6:03 PM Wilco Dijkstra <wilco.dijks...@arm.com> wrote:
> >>
> >> Currently arm_legitimize_address doesn't handle Thumb-2 at all, resulting 
> >> in
> >> inefficient code.  Since Thumb-2 supports similar address offsets use the 
> >> Arm
> >> legitimization code for Thumb-2 to get significant codesize and performance
> >> gains.  SPECINT2006 shows 0.4% gain on Cortex-A57, while SPECFP improves 
> >> 0.2%.
> >
> > What were the sort of code size gains  ? It did end up piquing my
> > curiosity as to how we missed something so basic.  For instance ldr
>
> h264ref is 1% smaller, various other benchmarks a few hundred bytes to 1KB
> smaller (~0.1%).
>
> > r0, [r0, #-4080] is valid in Arm state but not in Thumb2. Thus if
> >  there was an illegitimate address given here, would we end up
> > producing plus (r0, -4080) ? Yeah a simple testcase doesn't work out.
> > Scratching my head a bit , it's late at night.
>
> If the proposed offsets are not legal, GCC tries something different that is
> legal, hence there is no requirement to only propose legal splits. In any
> case we don't optimize the negative range even on Arm - the offsets are
> always split into 4KB ranges (not 8KB as would be possible).
>
> The negative offset splitting doesn't appear to have changed on Thumb-2
> so there is apparently a different mechanism that deals with negative offsets.
> So only positive offsets are improved with my patch:
>
> int f(int *p) { return p[1025] + p[1026]; }
>
> before:
>         movw    r3, #4104
>         movw    r2, #4100
>         ldr     r2, [r0, r2]
>         ldr     r0, [r0, r3]
>         add     r0, r0, r2
>         bx      lr
>
> after:
>         add     r3, r0, #4096
>         ldrd    r0, r3, [r3, #4]
>         add     r0, r0, r3
>         bx      lr
>
> > Orthogonally it looks like you can clean up the MINUS handling here
> > and in legitimate_address_p , I'm not sure what the status of LRA with
> > MINUS is either and thus we should now look to clean this up or look
> > to turn this on and see what happens. However that's a subject of a
> > future patch.
>
> Well there are lots of cases that aren't handled correctly or optimally
> yet - I'd copy the way I wrote aarch64_legitimize_address_displacement,
> but that's for a future patch indeed.
>
> > For the record, bootstrap with Thumb2  presumably and the testruns were 
> > clean ?
>
> Yes, at the time I ran them.

Yeah it dropped about 2660 bytes out of a Thumb2 bootstrap - so it
seems worth while.

Ok please apply but as usual watch out for any fallout from this.

regards
Ramana
>
> Cheers,
> Wilco

Reply via email to