Re: GSOC 2018 - Textual LTO dump tool project

Richard Biener Tue, 27 Feb 2018 05:27:52 -0800

On Sun, Feb 25, 2018 at 10:46 AM, Martin Jambor <mjam...@suse.cz> wrote:
> Hello Hrishikesh,
>
> I apologize for replying to you this late; this has been a busy week
> and now I am traveling.
>
> On Mon, Feb 19 2018, Hrishikesh Kulkarni wrote:
>> Hi,
>>
>> I am Hrishikesh Kulkarni currently studying as an undergrad student in
>> Computer Engineering at Pune University, India. I find compilers quite
>> interesting as a subject, and would like to apply to GSoC to gain some
>> understanding of how real-world compilers work. So far, I have managed to
>> build gcc and perform some simple tweaks to the codebase. In particular, I
>> would like to apply to the Textual LTO dump tool project.
>>
>
> I must say I am impressed by the research you have already done.
> Nevertheless, please note that Ray Kim has also expressed interest in
> the project.  Martin Liska will be the mentor, so I will let him drive
> the selection process.  On the other hand, Ray also liked another
> project, so maybe he will pick that and everyone will be happy.
>
>> As far as I understand, the motivation for the LTO framework was to enable
>> cross-file interprocedural optimizations, and for this purpose an ipa pass
>> is divided into the following three stages:
>>
>>    1. LGEN: The pass does a local analysis of the function and generates a
>>       “summary”, i.e., the information relevant to the pass, and writes it
>>       to the LTO object file.
>
> A pass might do that, but the output of the whole stage is not just the
> pass summaries, it also writes the function IL (the function gimple
> statements, above all) to the object file.
>
>>    2. WPA: The LTO object files are given as input to the linker, which then
>>       invokes the lto1 frontend to perform global ipa analysis over the
>>       call-graph and write optimized summaries to LTO object files
>>       (partitioning). The global ipa analysis is done over the summaries
>>       and not the actual function bodies.
>
> Well... note that partitioning actually means dividing the whole
> compiled program/library into chunks that are then compiled
> independently in the LTRANS stage.  But you are basically right that WPA
> does also do whole-program analysis based on summaries and then writes
> its decisions to optimization summaries, yes.
>
>>    3. LTRANS: The partitions are read back, and the function bodies are
>>       reconstructed from the summaries and are then compiled to produce
>>       real object files.
>
> Function bodies and the summaries are distinct things.  The body
> consists of gimple statements and all the associated stuff (such as
> types, so a lot of stuff), whereas when we refer to summaries, we mean
> small chunks of data that interprocedural optimizations such as inlining
> or IPA-CP squirrel away because they cannot feasibly work on bodies of the
> entire program.
>
> But apart from this terminology issue, you are basically correct: at the
> LTRANS stage, IPA passes apply transformations to the bodies according
> to the optimization summary generated by the WPA phase.  And then all
> normal, intra-procedural passes and code generation run.
>
>>
>>
>> If I understand correctly, the motivation for the textual LTO dump tool is
>> to easily analyze the contents of an LTO object file, similar to readelf or
>> objdump?
>
> That is how I understand it too, but Martin may have some further uses
> in mind.
>
>>
>> Assume that LTO object file contains in pureconst section: 0b0110 (0b for
>> binary prefix) corresponding to values of fs->pure_const_state and
>> fs->state_previously_known.
>>
>> If I understand correctly, the output of dump tool should then be:
>>
>> pure_const pass:
>>
>> pure_const_state = IPA_PURE (enum value of pure_const_state_e corresponding
>> to 0b01)
>>
>> state_previously_known = IPA_NEITHER (enum value of pure_const_state_e
>> corresponding to 0b10)
>>
>> Is this the expected output of the dump tool?
>
> I think the tool would have to do a bit more than just dump summaries of
> IPA passes.  I tend to think that the task should also include dumping
> gimple bodies (but we already do that in GCC and so it should be mostly
> easy) and also types (which are merged as one of the first steps of
> WPA, and interesting things happen when they are merged).  And perhaps
> quite a bit more.  Martin?
>
>>
>> I am reasonably familiar with C, C++ and Python. My prior
>> experience includes opportunities to work in areas of NLP. Some of my
>> accomplishments in the area include presenting project VicharDhara- A
>> thought Mapper that was selected among top five ideas in Accenture
>> Innovation Challenge among 7000 nationwide entries. My paper on this topic
>> won the best paper award in IEEE Conference ICCUBEA-2017. My previous work
>> was focused on simple parsers, student psychology, thought process
>> detection for team selection.
>
> Interesting, congratulations.
>
>>
>> In the interim, I have been through a few docs on GCC and LTO [1][2][3] and
>> am trying to write a toy ipa pass to better understand LTO/IPA
>> infrastructure.
>
> Great, I believe that's exactly what my advice would be.
>
>> I would be grateful for feedback on the textual LTO dump
>> tool.
>
> I hope that Martin will shed a bit more light on what output he
> envisions the tool to have.  I will talk to him about it too when I get
> back to the office (so maybe on Tuesday but probably on Wednesday).


See also my reply to the other candidate's questions about this
project.

I believe a first step would be to provide a "driver" aka lto-dump that
links enough of GCC objects itself to be able to LTO-input a single
LTO object.  From there details like GIMPLE bodies can be dumped
with standard GCC facilities.
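
As a rough illustration of the summary-dumping part, here is a minimal
sketch in C that decodes the 0b0110 example from the quoted mail.  It is
a toy and not the GCC streamer API: the helper names (state_name,
field2, dump_pure_const) are made up for illustration, the enum simply
mirrors the order of pure_const_state_e, and the assumption that the
first field sits in the high two bits follows the quoted example rather
than the real on-disk bitpack layout, which may well differ.

```c
#include <assert.h>
#include <stdio.h>

/* Mirrors the order of GCC's pure_const_state_e (assumed here). */
enum pure_const_state { IPA_CONST = 0, IPA_PURE = 1, IPA_NEITHER = 2 };

/* Map a decoded 2-bit value to a printable enum name. */
static const char *state_name(unsigned value)
{
    switch (value) {
    case IPA_CONST:   return "IPA_CONST";
    case IPA_PURE:    return "IPA_PURE";
    case IPA_NEITHER: return "IPA_NEITHER";
    default:          return "<invalid>";
    }
}

/* Hypothetical helper: extract 2-bit field `index` (0 or 1) from a
   4-bit word.  Field 0 is taken from the high two bits, as in the
   quoted example; this is an assumption, not the real layout. */
static unsigned field2(unsigned packed, unsigned index)
{
    assert(index < 2);
    return (packed >> (2 * (1 - index))) & 0x3u;
}

/* Print the two fields in the textual form proposed in the thread. */
static void dump_pure_const(unsigned packed, FILE *out)
{
    fprintf(out, "pure_const pass:\n");
    fprintf(out, "  pure_const_state = %s\n",
            state_name(field2(packed, 0)));
    fprintf(out, "  state_previously_known = %s\n",
            state_name(field2(packed, 1)));
}
```

Calling dump_pure_const(0x6, stdout) reports IPA_PURE for
pure_const_state and IPA_NEITHER for state_previously_known, matching
the output proposed in the quoted mail.  A real tool would read these
values through GCC's own LTO input routines instead of shifting bits by
hand.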

Richard.

> Thanks,
>
> Martin
>
>
>
>>
>> [1] http://www.ucw.cz/~hubicka/slides/labs2013.pdf
>>
>> [2] https://gcc.gnu.org/wiki/LinkTimeOptimization
>>
>> [3] https://gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html
>>
>> My two recent publications are listed below:
>>
>> [A] Hrishikesh Kulkarni, "Contextual Data Representation Using Prime Number
>> Route Mapping Method and Ontology" IEEE Conference, ICCUBEA, 2017
>>
>> [B] Hrishikesh Kulkarni, “Multi-Graph based Intent Hierarchy Generation to
>> Determine Action Sequence”, Springer Conference, ICDECT, December 2017, Pune
>>
>> Thanks,
>>
>> Hrishikesh Kulkarni
