On Wed, Sep 25, 2024 at 5:32 PM Aleksandar Rakic
<aleksandar.ra...@htecgroup.com> wrote:
>
> Hi,
>
> I think I managed to fix indentation from the previous version.
>
> When comparing the tables showing the candidates for group 1 before
> and after applying this patch, it can be observed that the complexity
> of candidates whose computation depends on invariant expressions or
> invariant variables should be at least one, which aligns with the
> approach used in commit c2b64ce.

Commit c2b64ce is

+2017-05-11  Bin Cheng  <bin.ch...@arm.com>
+
+       * tree-ssa-address.c (struct mem_address): Move to header file.
+       (valid_mem_ref_p, move_fixed_address_to_symbol): Make it global.
+       * tree-ssa-address.h (struct mem_address): Move from C file.
+       (valid_mem_ref_p, move_fixed_address_to_symbol): Declare.

which doesn't contain any "approach" to complexity; it just makes a function global.

>
> ===== Before this patch =====
> Group 1:
>   cand  cost    compl.  inv.expr.       inv.vars
>   1     11      0       5;      NIL;
>   2     11      0       6;      NIL;
>   4     8       0       7;      NIL;
>   5     9       0       8;      NIL;
>   6     1       0       NIL;    NIL;
>   7     1       1       NIL;    NIL;
>   9     7       0       5;      NIL;
> ===== Before this patch =====
> ===== After this patch =====
> Group 1:
>   cand  cost    compl.  inv.expr.       inv.vars
>   1     11      2       4;      NIL;

why does complexity go up to 2 from 0 here?

>   2     11      1       4;      NIL;
>   4     8       1       5;      NIL;
>   5     8       2       6;      NIL;

Likewise, and why does cost change?

>   6     1       0       NIL;    NIL;
>   7     1       1       NIL;    NIL;
>   9     7       2       4;      NIL;

Likewise.

This comparison is probably not very useful without showing the actual candidate
and its uses?  The above before/after figures do not match the testcase
on top of trunk.

> ===== After this patch =====
>
> Hence, if invariant expressions or invariant variables are used when
> representing a use with a candidate, the complexity should be larger
> for more complex expressions, so it is incremented by one. I am not sure
> whether inv_present could be expressed as parts.

The testcase looks mips specific - it has just scan-tree-dump-not, which
could easily still PASS after a regression.  Can you instead
add a gcc.target/mips/ testcase that scans for actual assembler features?
If the testcase relies on inlining daxpy, declaring it static helps ensure
you just get dgefa in the final assembly.  If you want to scan both functions
I suggest splitting the testcase into two to make it more reliable.

I see r15-5650-gd9c908b7503965 for a --target=mips64-r6-linux-gnu generates
for the innermost loop of dgefa

.L12:
        addu    $3,$9,$2
        addu    $3,$3,$8
        lwc1    $f1,0($3)
        lwc1    $f0,0($2)
        addiu   $7,$7,1
        mul.s   $f1,$f2,$f1
        addiu   $2,$2,4
        slt     $3,$7,$10
        add.s   $f0,$f0,$f1
        .set    noreorder
        .set    nomacro
        bne     $3,$0,.L12

and with the patch

.L12:
        addu    $3,$9,$2
        addu    $3,$3,$8
        lwc1    $f1,0($3)
        lwc1    $f0,0($2)
        addiu   $7,$7,1
        mul.s   $f1,$f2,$f1
        addiu   $2,$2,4
        slt     $3,$7,$10
        add.s   $f0,$f0,$f1
        .set    noreorder
        .set    nomacro
        bne     $3,$0,.L12

that's suspiciously identical?!  In fact the whole testcase generates
identical code.

So besides not being able to see the actual problem (maybe I need some
-march/-mtune?), the actual issue I have with the patch is that we try
to distribute aff_inv to other components, and for the parts that fail
to be distributed we cost them via

  if (comp_inv != NULL_TREE)
    cost = force_var_cost (data, comp_inv, inv_vars);

Simply ensuring the complexity is at least one would have meant changing that to

  if (comp_inv != NULL_TREE)
    {
      cost = force_var_cost (data, comp_inv, inv_vars);
      /* Ensure complexity is at least one.  */
      cost.complexity = MAX (1, cost.complexity);
    }

or alternatively just do that for the if (comp_inv && inv_expr && !simple_inv)
case (it's a bit odd that we adjust the cost there only for
'inv_expr != NULL').

The patch you posted, instead of just adjusting complexity, seems to change
the way we distribute the invariant - in particular we now distribute it to
parts.offset even when that is not supported (!(ok_with_ratio_p ||
ok_without_ratio_p)), which is an odd change.

Complexity is supposed to be a tie-breaker, so I think that having a bigger
complexity when we can't move the invariant fully to the index is OK - in
the end any part that cannot be moved will, I think, end up being applied
to the base (that we have essentially two functions for this, one for
costing and one for the actual code emission, is a bit unfortunate).

In the end I can't ack this patch, as I cannot reproduce an effect on the
testcase you added, and because of the odd part of the patch which is
clearly doing more than its description says.

Thanks,
Richard.


> Regards,
> Aleksandar
