Bernd Schmidt <ber...@codesourcery.com> writes:
>> Is there no possibility of running the optimisation in cfglayout mode
>> instead?  It seems from this and the forwarder block stuff above as
>> though it might make things easier, but maybe not.
>
> I'm not sure what you mean here. This reordering isn't for the sake of
> the optimization code; it's necessary due to the encoding of the
> Blackfin LSETUP instruction which assumes that the loop start label is
> before the loop end label.
>
> The optimizing functions definitely need to see RTL that corresponds
> directly to generated assembly, since they need to count instruction
> sizes. So I think cfglayout is out for that.

Yeah, good point.

>>> +  /* Every loop contains in its list of inner loops every loop nested 
>>> inside
>>> +     it, even if there are intermediate loops.  This works because we're 
>>> doing
>>> +     a depth-first search here and never visit a loop more than once.  */
>>> +  for (ix = 0; VEC_iterate (loop_info, loop->loops, ix, inner); ix++)
>>> +    {
>>> +      optimize_loop (inner, hooks);
>> 
>> Perhaps use a worklist here, to avoid O(loops)-deep recursion?
>
> Well, it's O(counting loop nesting level) deep, which I think isn't much
> of a problem in practice. One could even argue that it's a O(1) since
> the register allocator will eventually run out of loop iteration registers.

Ah yeah, I hadn't thought about FIRST_PSEUDO_REGISTERS being an
effective cap on it.  I think it'd be worth adding a comment along
the lines of:

  Note that the depth of recursion is limited by FIRST_PSEUDO_REGISTER,
  because loops are tied directly to an iterator register.

above the inner optimize_loops calls (or whereever you think is best),
just to emphasise why recursion isn't a problem.

> Thanks for the review - new patch below. I think this addresses
> everything.

Yeah, looks good.  Thanks especially for all the extra commentary,
both in the "headline" places and elsewhere.  It really helps
clarify what's going on.

>       * hw-doloop.c: New file.
>       * hw-doloop.h: New file.
>       * Makefile.in (OBJS): Add hw-doloop.o.
>       (hw-doloop.o): New rule.
>       ($(obj_out_file)): Add hw-doloop.h dependency.
>       * config/bfin/bfin.c: Include "hw-doloop.h".
>       (loop_info, DEF_VEC_P for loop_info, loop_info_d): Remove.
>       (bfin_dump_loops, bfin_bb_in_loop, bfin_scan_loop): Remove.
>       (hwloop_optimize): Renamed from bfin_optimize_loop.  Argument
>       type changed to hwloop_info.  Return bool, true if the loop was
>       successfully optimized.  Remove code that was moved to
>       hw-doloop.c, and adjust other parts.
>       (hwloop_fail): New static function, containing parts that used
>       to be in bfin_optimize_loop.
>       (bfin_discover_loop, bfin_discover_loops, free_loops,
>       bfin_reorder_loops): Remove.
>       (hwloop_pattern_reg): New static function.
>       (bfin_doloop_hooks): New variable.
>       (bfin_reorg_loops): Remove most code, call reorg_loops.
>       * config/bfin/bfin.md (doloop_end splitter): Also enable if
>       loop counter is a memory_operand.

The target-independent bits are OK with me with the changes below.
Obviously everything else is in your port anyway.

> +/* Try to identify a forwarder blocks that jump into LOOP, and add it to

s/blocks/block/

> +#define BB_AUX_INDEX(BB) ((intptr_t)(BB)->aux)

Missing space before (BB).

> +/* Initialize the aux fields to give ascending indices to basic blocks.  */
> +static void
> +set_bb_indices (void)
> +{
> +  basic_block bb;
> +  unsigned index;
> +
> +  index = 0;
> +  FOR_EACH_BB (bb)
> +    bb->aux = (PTR)index++;

s/(PTR)/(void *) /.  (I always associate PTR with the old K&R stuff.)
More importantly, "index" needs to be intptr_t in order to avoid warnings.

(Could you bootstrap this on x86_64 to check for things like that?
A C bootstrap only should be fine of course, since the code isn't
going to be run.)

> +  hwloop_info loops = NULL;

Unnecessary initialisation (or at least, it should be).

Richard

Reply via email to