On Sun, Feb 25, 2018 at 10:46 AM, Martin Jambor <mjam...@suse.cz> wrote:
> Hello Hrishikesh,
>
> I apologize for replying to you this late, this has been a busy week
> and now I am traveling.
>
> On Mon, Feb 19 2018, Hrishikesh Kulkarni wrote:
>> Hi,
>>
>> I am Hrishikesh Kulkarni currently studying as an undergrad student in
>> Computer Engineering at Pune University, India. I find compilers quite
>> interesting as a subject, and would like to apply to GSoC to gain some
>> understanding of how real-world compilers work. So far, I have managed to
>> build gcc and perform some simple tweaks to the codebase. In particular, I
>> would like to apply to the Textual LTO dump tool project.
>>
>
> I must say I am impressed by the research you have already done.
> Nevertheless, please note that Ray Kim has also expressed interest in
> the project. Martin Liska will be the mentor, so I will let him drive
> the selection process. On the other hand, Ray also liked another
> project, so maybe he will pick that and everyone will be happy.
>
>> As far as I understand, the motivation for LTO framework was to enable
>> cross file interprocedural optimizations, and for this purpose an ipa pass
>> is divided into following three stages:
>>
>> 1. LGEN: The pass does a local analysis of the function and generates a
>> “summary”, ie, the information relevant to the pass and writes it to LTO
>> object file.
>
> A pass might do that, but the output of the whole stage is not just the
> pass summaries, it also writes the function IL (the function gimple
> statements, above all) to the object file.
>
>> 2. WPA: The LTO object files are given as input to the linker, which then
>> invokes the lto1 frontend to perform global ipa analysis over the
>> call-graph and write optimized summaries to LTO object files
>> (partitioning). The global ipa analysis is done over summary and not the
>> actual function bodies.
>
> Well...
> note that partitioning actually means dividing the whole
> compiled program/library into chunks that are then compiled
> independently in the LTRANS stage. But you are basically right that WPA
> does also do whole-program analysis based on summaries and then writes
> its decisions to optimization summaries, yes.
>
>> 3. LTRANS: The partitions are read back, and the function bodies are
>> reconstructed from summary and are then compiled to produce real object
>> files.
>
> Function bodies and the summaries are distinct things. The body
> consists of gimple statements and all the associated stuff (such as
> types, so a lot of stuff), whereas when we refer to summaries, we mean
> small chunks of data that interprocedural optimizations such as inlining
> or IPA-CP squirrel away because they cannot feasibly work on bodies of the
> entire program.
>
> But apart from this terminology issue, you are basically correct, at the
> LTRANS stage, IPA passes apply transformations to the bodies according
> to the optimization summary generated by the WPA phase. And then, all
> normal, intra-procedural passes and code generation run.
>
>>
>> If I understand correctly, the motivation for textual LTO dump tool is to
>> easily analyze contents of LTO object file, similar to readelf or objdump?
>
> That is how I understand it too, but Martin may have some further uses
> in mind.
>
>>
>> Assume that LTO object file contains in pureconst section: 0b0110 (0b for
>> binary prefix) corresponding to values of fs->pure_const_state and
>> fs->state_previously_known.
>>
>> If I understand correctly, the output of dump tool should then be:
>>
>> pure_const pass:
>>
>> pure_const_state = IPA_PURE (enum value of pure_const_state_e corresponding
>> to 0b01)
>>
>> state_previously_known = IPA_NEITHER (enum value of pure_const_state_e
>> corresponding to 0b10)
>>
>> Is this the expected output of the dump tool?
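
To make the question above concrete, here is a minimal Python sketch of the decoding it describes. It is an illustration only and does not read real LTO sections or use any GCC code. The two-bit field width, the high-bits-first field order, and the helper names (`PureConstState`, `decode_pure_const`, `dump`) are assumptions made for this sketch. The mail only gives 0b01 for IPA_PURE and 0b10 for IPA_NEITHER; the value 0 for IPA_CONST is likewise assumed.

```python
from enum import IntEnum


class PureConstState(IntEnum):
    """Hypothetical stand-in for pure_const_state_e.

    IPA_PURE = 0b01 and IPA_NEITHER = 0b10 follow the mail;
    IPA_CONST = 0 is an assumption of this sketch.
    """
    IPA_CONST = 0
    IPA_PURE = 1
    IPA_NEITHER = 2


FIELD_BITS = 2  # assumption: each state occupies two bits


def decode_pure_const(packed: int) -> dict:
    """Split a 4-bit value into two states, reading the high bits first
    (the order used in the mail's example, not necessarily GCC's)."""
    mask = (1 << FIELD_BITS) - 1
    first = (packed >> FIELD_BITS) & mask
    second = packed & mask
    return {
        "pure_const_state": PureConstState(first),
        "state_previously_known": PureConstState(second),
    }


def dump(packed: int) -> str:
    """Render the decoded fields as a small textual dump."""
    fields = decode_pure_const(packed)
    lines = ["pure_const pass:"]
    lines += [f"  {name} = {state.name}" for name, state in fields.items()]
    return "\n".join(lines)


print(dump(0b0110))
```

With the input 0b0110 from the mail, this prints pure_const_state = IPA_PURE and state_previously_known = IPA_NEITHER. A real tool would have to follow whatever bit order and field widths the pass actually uses when it streams its summary.
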
>
> I think the tool would have to do a bit more than just dumping summaries of
> IPA passes. I tend to think that the task should also include dumping
> gimple bodies (but we already do that in GCC and so it should be mostly
> easy) and also of types (that are merged as one of the first steps of
> WPA and interesting things happen when the merging does something
> "interesting"). And perhaps quite a bit more. Martin?
>
>>
>> I am reasonably familiar working with C, C++ and python. My prior
>> experience includes opportunities to work in areas of NLP. Some of my
>> accomplishments in the area include presenting project VicharDhara- A
>> thought Mapper that was selected among top five ideas in Accenture
>> Innovation Challenge among 7000 nationwide entries. My paper on this topic
>> won the best paper award in IEEE Conference ICCUBEA-2017. My previous work
>> was focused on simple parsers, student psychology, thought process
>> detection for team selection.
>
> Interesting, congratulations.
>
>>
>> In the interim, I have been through a few docs on GCC and LTO [1][2][3] and
>> am trying to write a toy ipa pass to better understand LTO/IPA
>> infrastructure.
>
> Great, I believe that's exactly what my advice would be.
>
>> I would be grateful for feedback on the textual LTO dump
>> tool.
>
> I hope that Martin will shed a bit more light on what output he
> envisions the tool to have. I will talk to him about it too when I get
> back to the office (so maybe on Tuesday but probably on Wednesday).

See also the mail in which I responded to the other candidates' questions
about this project.

I believe a first step would be to provide a "driver", aka lto-dump, that
links enough of GCC's objects itself to be able to LTO-input a single LTO
object. From there, details like GIMPLE bodies can be dumped with standard
GCC facilities.

Richard.

> Thanks,
>
> Martin
>
>>
>> [1] http://www.ucw.cz/~hubicka/slides/labs2013.pdf
>>
>> [2] https://gcc.gnu.org/wiki/LinkTimeOptimization
>>
>> [3] https://gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html
>>
>> My two recent publications are listed below:
>>
>> [A] Hrishikesh Kulkarni, "Contextual Data Representation Using Prime Number
>> Route Mapping Method and Ontology" IEEE Conference, ICCUBEA, 2017
>>
>> [B] Hrishikesh Kulkarni, "Multi-Graph based Intent Hierarchy Generation to
>> Determine Action Sequence", Springer Conference, ICDECT, December 2017, Pune
>>
>> Thanks,
>>
>> Hrishikesh Kulkarni