On Tue, Dec 3, 2013 at 1:50 PM, Yufeng Zhang <yufeng.zh...@arm.com> wrote:
> On 12/03/13 06:48, Jeff Law wrote:
>>
>> On 12/02/13 08:47, Yufeng Zhang wrote:
>>>
>>> Ping~
>>>
>>> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03360.html
>>
>>
>>>
>>> Thanks,
>>> Yufeng
>>>
>>> On 11/26/13 15:02, Yufeng Zhang wrote:
>>>>
>>>> On 11/26/13 12:45, Richard Biener wrote:
>>>>>
>>>>> On Thu, Nov 14, 2013 at 12:25 AM, Yufeng
>>>>> Zhang<yufeng.zh...@arm.com>    wrote:
>>>>>>
>>>>>> On 11/13/13 20:54, Bill Schmidt wrote:
>>>>>>>
>>>>>>> The second version of your original patch is ok with me with the
>>>>>>> following changes.  Sorry for the little side adventure into the
>>>>>>> next-interp logic; in the end that's going to hurt more than it
>>>>>>> helps in
>>>>>>> this case.  Thanks for having a look at it, anyway.  Thanks also for
>>>>>>> cleaning up this version to be less intrusive to common interfaces; I
>>>>>>> appreciate it.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks a lot for the review.  I've attached an updated patch with the
>>>>>> suggested changes incorporated.
>>>>>>
>>>>>> For the next-interp adventure, I was quite happy to do the
>>>>>> experiment; it's
>>>>>> a good chance of gaining insight into the pass.  Many thanks for
>>>>>> your prompt
>>>>>> replies and patience in guiding!
>>>>>>
>>>>>>
>>>>>>> Everything else looks OK to me.  Please ask Richard for final
>>>>>>> approval,
>>>>>>> as I'm not a maintainer.
>>
>> First a note, I need to check on voting for Bill as the slsr maintainer
>> from the steering committee.   Voting was in progress just before the
>> close of stage1 development so I haven't tallied the results :-)
>
>
> Looking forward to some good news! :)
>
>
>>>>
>>>> Yes, you are right about the non-trivial 'base' tree are rarely shared.
>>>>     The cached is introduced mainly because get_alternative_base () may
>>>> be
>>>> called twice on the same 'base' tree, once in the
>>>> find_basis_for_candidate () for look-up and the other time in
>>>> alloc_cand_and_find_basis () for record_potential_basis ().  I'm happy
>>>> to leave out the cache if you think the benefit is trivial.
>>
>> Without some sense of how expensive the lookups are vs how often the
>> cache hits it's awful hard to know if the cache is worth it.
>>
>> I'd say take it out unless you have some sense it's really saving time.
>>    It's a pretty minor implementation detail either way.
>
>
> I think the affine tree routines are generally expensive; it is worth having
> a cache to avoid calling them too many times.  I run the slsr-*.c tests
> under gcc.dg/tree-ssa/ and find out that the cache hit rates range from
> 55.6% to 90%, with 73.5% as the average.  The samples may not well represent
> the real world scenario, but they do show the fact that the 'base' tree can
> be shared to some extent.  So I'd like to have the cache in the patch.
>
>
>>
>>>>
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-options "-O2 -fdump-tree-slsr" } */
>>>>> +
>>>>> +typedef int arr_2[50][50];
>>>>> +
>>>>> +void foo (arr_2 a2, int v1)
>>>>> +{
>>>>> +  int i, j;
>>>>> +
>>>>> +  i = v1 + 5;
>>>>> +  j = i;
>>>>> +  a2 [i-10] [j] = 2;
>>>>> +  a2 [i] [j++] = i;
>>>>> +  a2 [i+20] [j++] = i;
>>>>> +  a2 [i-3] [i-1] += 1;
>>>>> +  return;
>>>>> +}
>>>>> +
>>>>> +/* { dg-final { scan-tree-dump-times "MEM" 5 "slsr" } } */
>>>>> +/* { dg-final { cleanup-tree-dump "slsr" } } */
>>>>>
>>>>> scanning for 5 MEMs looks non-sensical.  What transform do
>>>>> you expect?  I see other slsr testcases do similar non-sensical
>>>>> checking which is bad, too.
>>>>
>>>>
>>>> As the slsr optimizes CAND_REF candidates by simply lowering them to
>>>> MEM_REF from e.g. ARRAY_REF, I think scanning for the number of MEM_REFs
>>>> is an effective check.  Alternatively, I can add a follow-up patch to
>>>> add some dumping facility in replace_ref () to print out the replacing
>>>> actions when -fdump-tree-slsr-details is on.
>>
>> I think adding some details to the dump and scanning for them would be
>> better.  That's the only change that is required for this to move forward.
>
>
> I've updated to patch to dump more details when -fdump-tree-slsr-details is
> on.  The tests have also been updated to scan for these new dumps instead of
> MEMs.
>
>
>>
>> I suggest doing it quickly.  We're well past stage1 close at this point.
>
>
> The bootstrapping on x86_64 is still running.  OK to commit if it succeeds?

I still don't like it.  It's using the wrong and too expensive tools to do
stuff.  What kind of bases are we ultimately interested in?  Browsing
the code it looks like we're having

  /* Base expression for the chain of candidates:  often, but not
     always, an SSA name.  */
  tree base_expr;

which isn't really too informative but I suppose they are all
kind-of-gimple_val()s?  That said, I wonder if you can simply
use get_addr_base_and_unit_offset in place of get_alternative_base (),
ignoring the returned offset.

Richard.

> Thanks,
>
> Yufeng
>
> gcc/
>
>         * gimple-ssa-strength-reduction.c: Include tree-affine.h.
>         (name_expansions): New static variable.
>         (alt_base_map): Ditto.
>         (get_alternative_base): New function.
>         (find_basis_for_candidate): For CAND_REF, optionally call
>         find_basis_for_base_expr with the returned value from
>         get_alternative_base.
>         (record_potential_basis): Add new parameter 'base' of type 'tree';
>         add an assertion of non-NULL base; use base to set node->base_expr.
>         (alloc_cand_and_find_basis): Update; call record_potential_basis
>         for CAND_REF with the returned value from get_alternative_base.
>         (replace_refs): Dump details on the replacing.
>
>         (execute_strength_reduction): Call pointer_map_create for
>         alt_base_map; call free_affine_expand_cache with &name_expansions.
>
> gcc/testsuite/
>
>         * gcc.dg/tree-ssa/slsr-39.c: Update.
>         * gcc.dg/tree-ssa/slsr-41.c: New test.

Reply via email to