Re: [RFC][PATCH] Introduce -fdump*-folding

Richard Biener Tue, 09 May 2017 05:53:31 -0700

On Tue, May 9, 2017 at 2:46 PM, Martin Liška <mli...@suse.cz> wrote:
> On 05/09/2017 02:16 PM, Richard Biener wrote:
>> On Tue, May 9, 2017 at 2:01 PM, Martin Liška <mli...@suse.cz> wrote:
>>> On 05/05/2017 01:50 PM, Richard Biener wrote:
>>>> On Thu, May 4, 2017 at 1:10 PM, Martin Liška <mli...@suse.cz> wrote:
>>>>> On 05/04/2017 12:40 PM, Richard Biener wrote:
>>>>>>
>>>>>> On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mli...@suse.cz> wrote:
>>>>>>>
>>>>>>> On 05/03/2017 12:12 PM, Richard Biener wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mli...@suse.cz> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hello
>>>>>>>>>
>>>>>>>>> During the last release cycle I spent quite some time reading IVOPTS
>>>>>>>>> pass dump files. Using -fdump*-details generates a lot of
>>>>>>>>> 'Applying pattern' lines, which can make a dump file harder to read.
>>>>>>>>>
>>>>>>>>> Here are stats for tramp3d with -O2 and -fdump-tree-all-details.
>>>>>>>>> The percentage shows how many lines of each dump file match the
>>>>>>>>> aforementioned pattern:
>>>>>>>>>
>>>>>>>>>                         tramp3d-v4.cpp.164t.ivopts: 6.34%
>>>>>>>>>                           tramp3d-v4.cpp.091t.ccp2: 5.04%
>>>>>>>>>                       tramp3d-v4.cpp.093t.cunrolli: 4.41%
>>>>>>>>>                       tramp3d-v4.cpp.129t.laddress: 3.70%
>>>>>>>>>                           tramp3d-v4.cpp.032t.ccp1: 2.31%
>>>>>>>>>                           tramp3d-v4.cpp.038t.evrp: 1.90%
>>>>>>>>>                      tramp3d-v4.cpp.033t.forwprop1: 1.74%
>>>>>>>>>                           tramp3d-v4.cpp.103t.vrp1: 1.52%
>>>>>>>>>                      tramp3d-v4.cpp.124t.forwprop3: 1.31%
>>>>>>>>>                           tramp3d-v4.cpp.181t.vrp2: 1.30%
>>>>>>>>>                        tramp3d-v4.cpp.161t.cunroll: 1.22%
>>>>>>>>>                     tramp3d-v4.cpp.027t.fixup_cfg3: 1.11%
>>>>>>>>>                        tramp3d-v4.cpp.153t.ivcanon: 1.07%
>>>>>>>>>                           tramp3d-v4.cpp.126t.ccp3: 0.96%
>>>>>>>>>                           tramp3d-v4.cpp.143t.sccp: 0.91%
>>>>>>>>>                      tramp3d-v4.cpp.185t.forwprop4: 0.82%
>>>>>>>>>                            tramp3d-v4.cpp.011t.cfg: 0.74%
>>>>>>>>>                      tramp3d-v4.cpp.096t.forwprop2: 0.50%
>>>>>>>>>                     tramp3d-v4.cpp.019t.fixup_cfg1: 0.37%
>>>>>>>>>                      tramp3d-v4.cpp.120t.phicprop1: 0.33%
>>>>>>>>>                            tramp3d-v4.cpp.133t.pre: 0.32%
>>>>>>>>>                      tramp3d-v4.cpp.182t.phicprop2: 0.27%
>>>>>>>>>                     tramp3d-v4.cpp.170t.veclower21: 0.25%
>>>>>>>>>                        tramp3d-v4.cpp.029t.einline: 0.24%
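
For readers who want to reproduce a figure like the ones above, here is a
minimal sketch of the arithmetic, assuming the percentage is simply the share
of lines in a dump file that contain the string 'Applying pattern'. The
function name and the default needle are illustrative assumptions, not part
of the patch.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: percentage of lines in DUMP_TEXT that contain NEEDLE.
// Returns 0.0 for empty input so callers never divide by zero.
inline double
pattern_line_percentage (const std::string &dump_text,
                         const std::string &needle = "Applying pattern")
{
  std::istringstream in (dump_text);
  std::string line;
  unsigned long total = 0, matching = 0;

  while (std::getline (in, line))
    {
      ++total;
      if (line.find (needle) != std::string::npos)
        ++matching;
    }

  return total == 0 ? 0.0 : 100.0 * matching / total;
}
```

Feeding it the contents of one dump file gives one row of such a table.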
>>>>>>>>>
>>>>>>>>> I suggest adding a new TDF flag dedicated to these lines.
>>>>>>>>> The patch bootstraps on ppc64le-redhat-linux and survives
>>>>>>>>> regression tests.
>>>>>>>>>
>>>>>>>>> Thoughts?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Ok.  Soon we'll want to change dump_flags to uint64_t ...  (we have
>>>>>>>> 1 bit left if you allow negative dump_flags).  It'll trickle down to
>>>>>>>> a lot of interfaces, so introducing dump_flags_t at the same time
>>>>>>>> might be a good idea.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hello.
>>>>>>>
>>>>>>> I've prepared a patch that migrates all interfaces and introduces
>>>>>>> dump_flags_t.
>>>>>>
>>>>>>
>>>>>> Great.
>>>>>>
>>>>>>> I'm currently testing it. Apart from that, Richi requested a more
>>>>>>> generic approach: a hierarchical structure of options.
>>>>>>
>>>>>>
>>>>>> Didn't really "request" it, it's just something we eventually need to
>>>>>> do when we run out of bits again ;)
>>>>>
>>>>>
>>>>> I know, but it was me who came up with the idea of finer-grained
>>>>> suboptions :)
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Can you please take a look at the self-contained source file that
>>>>>>> shows the way I've decided to go?
>>>>>>> Another question is whether we also want to implement "aliases",
>>>>>>> where for instance the current 'all' is equal to the union of a
>>>>>>> couple of suboptions?
>>>>>>
>>>>>>
>>>>>> Yeah, I think we do want -all-all-all and -foo-all to work.  Not sure
>>>>>> about -all-foo-all.
>>>>>
>>>>>
>>>>> Actually only having 'all' is quite easy to implement.
>>>>>
>>>>> Let's imagine the following hierarchy:
>>>>>
>>>>> (root)
>>>>> - vops
>>>>> - folding
>>>>>   - gimple
>>>>>     - ctor
>>>>>     - array_ref
>>>>>     - arithmetic
>>>>>   - generic
>>>>>     - c
>>>>>     - c++
>>>>>     - ctor
>>>>>     - xyz
>>>>>
>>>>> Then '-fdump-passname-folding-all' will be equal to
>>>>> '-fdump-passname-folding'.
>>>>
>>>> Ok, so you envision that sub-options restrict stuff.  I thought of
>>>>
>>>>  -gimple
>>>>    -vops
>>>>  -generic
>>>>    -folding
>>>>
>>>> so the other way around.  We do not have many options that would be RTL
>>>> specific, but the gimple-only ones are -vops -alias -scev -gimple -rhs-only
>>>> -verbose -memsyms, while RTL has -cselib.  -eh sounds gimple specific.
>>>> Then there's the optgroup stuff you already saw.
>>>>
>>>> So it looks like an 8-bit "group id" plus 56 bits of flags would do.
>>>>
>>>> Yes, this implies reworking how & and | work.  For example you can't
>>>> | dump-flags of different groups.
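
To make the "8-bit group id plus 56 bits of flags" idea concrete, here is a
minimal sketch of one possible encoding. All names (dump_group, grouped_flags,
try_union) are hypothetical and do not come from the patch. It only
illustrates why a plain '|' no longer works across groups.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical group ids; the thread only suggests that such groups exist.
enum class dump_group : std::uint8_t { generic = 0, gimple = 1, rtl = 2 };

// One 64-bit word: the top 8 bits hold the group id, the low 56 bits hold
// the per-group flags.
struct grouped_flags
{
  std::uint64_t bits;
};

constexpr std::uint64_t FLAG_MASK = (std::uint64_t{1} << 56) - 1;

constexpr grouped_flags
make_flags (dump_group g, std::uint64_t flags)
{
  return grouped_flags{ (std::uint64_t (static_cast<std::uint8_t> (g)) << 56)
                        | (flags & FLAG_MASK) };
}

constexpr dump_group
group_of (grouped_flags f)
{
  return static_cast<dump_group> (f.bits >> 56);
}

constexpr std::uint64_t
flags_of (grouped_flags f)
{
  return f.bits & FLAG_MASK;
}

// Union of two flag sets.  Returns false and leaves OUT untouched when the
// operands belong to different groups, since their bits mean different things.
inline bool
try_union (grouped_flags a, grouped_flags b, grouped_flags &out)
{
  if (group_of (a) != group_of (b))
    return false;
  out = make_flags (group_of (a), flags_of (a) | flags_of (b));
  return true;
}
```

With this layout, bit 0 of a gimple flag set and bit 0 of an RTL flag set are
unrelated options, which is why the union has to check the group first.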
>>>
>>> Well, I'm not opposed to the idea of converting that to the way you
>>> described.  So, you'd like to introduce something like:
>>>
>>> (root)
>>> - generic
>>>   - eh
>>>   - folding
>>>   - ...
>>> - gimple
>>>   - vops
>>>   - folding
>>>    - rhs-only
>>>    - ...
>>>   - vops
>>> - rtl
>>>   - cselib
>>>   - ...
>>>
>>> ?
>>
>> Yeah.  As said, the motivation was to escape the 32 (now 64) bit limitation,
>> not to make the user interface into a hierarchy.
>>
>> I suppose we can easily defer that, given we have 32 more bits available now ;)
>
> I see. Hopefully we can live for quite some time with another 32 bits, and
> I'm going to transform the TDF_* stuff into an enum.


Maybe another .def file with embedded docs ;)

Ok, no, I didn't suggest this.

/me runs...

> Martin
>
>>
>> Richard.
>>
>>>>
>>>>>>
>>>>>> The important thing is to make sure dump_flags_t stays POD and thus is
>>>>>> eligible to be passed in register(s).  In the end we might simply come
>>>>>> up with a two-level hierarchy, each 32 bits (or we can even get back to
>>>>>> 32 bits in total with two times 16 bits).
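
As a minimal sketch of the POD constraint, assuming a two-level layout of
2 x 32 bits: the struct and member names below are hypothetical, and whether
such a type really travels in registers depends on the target ABI. The
static_asserts only check the language-level properties that make this
possible.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical two-level flags type: 32 bits select the group(s), 32 bits
// select sub-options within the group.
struct two_level_flags
{
  std::uint32_t group;
  std::uint32_t sub;
};

// No user-provided constructors, destructor, or virtual functions, so the
// type stays trivially copyable with standard layout.
static_assert (std::is_trivially_copyable<two_level_flags>::value,
               "must stay trivially copyable");
static_assert (std::is_standard_layout<two_level_flags>::value,
               "must keep standard layout");
static_assert (sizeof (two_level_flags) == 8,
               "expected to fit in a single 64-bit word");

// Passing by value is the point of the exercise: no indirection is needed.
inline bool
has_sub (two_level_flags f, std::uint32_t mask)
{
  return (f.sub & mask) != 0;
}
```

Adding a non-trivial copy constructor or a virtual member would make the
first assertion fail at compile time.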
>>>>>
>>>>>
>>>>> I'm aware that the type needs to stay POD.
>>>>>
>>>>>>
>>>>>> It looks like you didn't actually implement this as a hierarchy though
>>>>>> but still allocate from one pool of bits (so you only change how
>>>>>> users access this?)
>>>>>
>>>>>
>>>>> Yep, all leaf options are mapped to a mask and all inner nodes are just
>>>>> the union of their suboptions. That will allow us to have 64 leaf
>>>>> suboptions. Once we reach the limit, we can encode the values in a more
>>>>> sophisticated way. That however brings the need to implement more
>>>>> complicated '&' and '|' operators.
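
The scheme described above can be sketched as follows, using the example
hierarchy from earlier in the thread. The constant names and bit positions
are illustrative assumptions, not the values used in the patch.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint64_t flags_t;

// Leaf options: one bit each (hypothetical names and positions).
constexpr flags_t F_VOPS              = flags_t{1} << 0;
constexpr flags_t F_GIMPLE_CTOR       = flags_t{1} << 1;
constexpr flags_t F_GIMPLE_ARRAY_REF  = flags_t{1} << 2;
constexpr flags_t F_GIMPLE_ARITHMETIC = flags_t{1} << 3;
constexpr flags_t F_GENERIC_C         = flags_t{1} << 4;
constexpr flags_t F_GENERIC_CXX       = flags_t{1} << 5;
constexpr flags_t F_GENERIC_CTOR      = flags_t{1} << 6;
constexpr flags_t F_GENERIC_XYZ       = flags_t{1} << 7;

// Inner nodes: the union of their children, so they cost no extra bits.
constexpr flags_t F_FOLDING_GIMPLE
  = F_GIMPLE_CTOR | F_GIMPLE_ARRAY_REF | F_GIMPLE_ARITHMETIC;
constexpr flags_t F_FOLDING_GENERIC
  = F_GENERIC_C | F_GENERIC_CXX | F_GENERIC_CTOR | F_GENERIC_XYZ;
constexpr flags_t F_FOLDING = F_FOLDING_GIMPLE | F_FOLDING_GENERIC;
constexpr flags_t F_ALL     = F_VOPS | F_FOLDING;

// True when at least one leaf of OPTION is present in SELECTED.
inline bool
enabled (flags_t selected, flags_t option)
{
  return (selected & option) != 0;
}
```

Because every leaf owns one bit, a 64-bit word caps the tree at 64 leaves,
and 'folding-all' is the same mask as 'folding'.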
>>>>>
>>>>> I'll finish the implementation and try to migrate the current handling
>>>>> to it. I guess I'm quite close.
>>>>
>>>> Hmm, but then there's not much advantage in suboptions (well, apart from
>>>> maybe at the user-side).
>>>
>>> Yep, please take a look at the updated version of PATCH 2/N, where I
>>> ported -fopt-info.  As you can see, I had to explicitly define all enum
>>> values and the hierarchy creation of every single node.
>>>
>>> Martin
>>>
>>>>
>>>>> Martin
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>>>
>>>>>>> Thanks for feedback,
>>>>>>> Martin
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>> Martin
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>
>
