On Mon, Apr 1, 2024 at 6:23 PM Thor Preimesberger via Gcc <gcc@gcc.gnu.org> wrote:
>
> Hey all,
>
> I'm joining the group of people submitting their GSoC applications
> over the holiday. I'm interested in the "Implement structured dumping
> of GENERIC" project idea, and the application I've written is below.

Thank you for the interest in this project.

> A quick question before though:
>
> - What would the expected use cases of the proposed
> -fdump-generic-nodes option be, in addition to, presumably, writing
> front ends into gcc?

I think the main use case is better "visual" debugging and understanding
of GENERIC. A structured dump would also allow custom processing, such as
gathering memory or other statistics.

> I'm also curious about targeting .gz/Graphviz; on a first
> blush, it doesn't seem like too much additional work, and it may be
> useful for the above applications. But I imagine there may be other
> ways to process the data that would ultimately be more useful.

Reading your message top-down, I think that dumping in a structured format
like JSON would allow targeting graphviz as postprocessing.

> Best,
> Thor Preimesberger
>
> --------------------------------
>
>
> Background:
>
> I'm an undergraduate student in pure mathematics who tinkers with
> technology in his free time. I've taken an interest in compilers as of
> last summer. I've written a couple of parsers, by hand, and a couple
> of toy passes in LLVM. I'm currently working through the code
> generation parts of the Dragon Book, in between my current course
> work. I'm familiar with C and C++, and I'm currently taking courses
> (on quantum information science, digital design, and computer
> architecture) that focus on topics adjacent or tertiary to compiler
> engineering.
> In the mathematical part of my life, I mostly concentrate on
> geometric analysis, and I've taken a few post graduate courses, on
> Ricci flow and on variational/geometric partial differential
> equations. These topics don't really capture all the mathematics I'm
> interested in, and I don't think any of this academic work is directly
> relevant to this project. But I hope that it conveys that I enjoy
> deep, technical work that interpolates between multiple levels of
> abstraction.
> I believe compiler engineering shares this same aesthetic appeal.
> This - and the pragmatic, altruistic nature of compiler engineering -
> draws me to the field and to GCC in particular.
>
>
> Expected Outcome:
> - A patch in the main GCC repository that adds an additional dump
> option (-fdump-generic-nodes) for the GENERIC intermediate
> representation that preserves its tree structure before it is lowered
> to GIMPLE. We want to initially target JSON, and then provide a
> human-readable translation into HTML.
>
> Additional features/visualizations are possible, but I would need
> to discuss them with the mentor, Richard Biener.
>
> Timeline:
>
> Pre-GSoC
>
> I've already built gcc, with and without offloading, and have
> successfully passed the tests under make-gcc. (Well, most of the tests
> for the version of GCC with offloading - I understand that that is to
> be expected.) I'm currently compiling some nontrivial programs to
> various stages of completion, and toying with them in GDB and with
> debug options.
>
> After this, I want to better understand the architecture of GCC
> and its intermediate representations. I would achieve this by reading
> the documentation and dumping lots of code.
>
> Contemporaneously, I would want to at least understand, if not
> possibly fix, a few items/bugs in the GCC Bugzilla. Here are two I
> want to look at, picked wholly by individual interest:
>
> - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38833
> - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97444
>
> (I'm happy to take a look at any issues anyone recommends -
> especially if they are more germane to the project than the above!)

I don't remember any particular bugs around dumping of GENERIC, but there
are bugs tagged with the easyhack keyword. Personally I find memory-hog and
compile-time-hog issues rewarding to fix, and at times an interesting way
to understand (tiny) bits of GCC very well.

> GSoC (Starting the week of May 27th, to August 26th)
>
> Community Bonding
>
> Understand the previously submitted patch in (
> https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646295.html )
> Understand all the intended uses of the project
> Scope out possible augmentations to the project and begin researching
> them after discussing with mentor.
> Continue patching effort, if it's not finished.
>
>
> Weeks 1-4
> Begin working on minimal viable product (GENERIC to JSON, and JSON to HTML)
> Finish scoping possible augmentations by week 4,
> Begin development on any augmentations once approval is obtained
>
> Weeks 4 - 8
> Continue working on minimal viable product
>
> Weeks 8 - 13
> Complete minimal viable product if it is not finished.
> Otherwise, begin working on possible augmentations as agreed upon with mentor
> Wrap up documentation for any unfinished pieces

This looks like a reasonable timeline and overall project structure. I
probably pointed to it in my responses to the initial patch, but to repeat
here: it would be very nice to integrate with the existing GENERIC dumping
in tree-pretty-print.cc. That's what you get when calling
'debug_tree (<node>)' from the inferior inside gdb. Implementation-wise,
the JSON target would then be a new dump flag (see the various TDF_* in
dumpfiles.h).

Note the deadline for uploading a proposal is today; please make sure to
upload early. I think you can always amend the proposal later.

Richard.
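
The post-processing idea mentioned in this reply (a JSON dump feeding Graphviz or simple statistics) can be sketched roughly as follows. This is only an illustration under stated assumptions: the JSON layout (a flat list of nodes with "id", "code" and "children" fields) and the helper names to_dot and code_histogram are made up for the example and do not describe the format of any existing or proposed GCC dump.

```python
import json
from collections import Counter

# Hypothetical dump for something like "a = b + 1". The schema (id, code,
# children) is an assumption for this sketch, not GCC's actual output.
SAMPLE_DUMP = """
[
  {"id": 1, "code": "modify_expr", "children": [2, 3]},
  {"id": 2, "code": "var_decl", "children": []},
  {"id": 3, "code": "plus_expr", "children": [4, 5]},
  {"id": 4, "code": "var_decl", "children": []},
  {"id": 5, "code": "integer_cst", "children": []}
]
"""


def to_dot(nodes):
    """Render the node list as Graphviz DOT text, one edge per child link."""
    lines = ["digraph generic {"]
    for node in nodes:
        lines.append(f'  n{node["id"]} [label="{node["code"]}"];')
    for node in nodes:
        for child in node["children"]:
            lines.append(f'  n{node["id"]} -> n{child};')
    lines.append("}")
    return "\n".join(lines)


def code_histogram(nodes):
    """Count how often each tree code appears in the dump."""
    return Counter(node["code"] for node in nodes)


if __name__ == "__main__":
    nodes = json.loads(SAMPLE_DUMP)
    print(to_dot(nodes))
    for code, count in code_histogram(nodes).most_common():
        print(f"{code}: {count}")
```

The DOT text could then be passed to the Graphviz "dot" tool for rendering. Keeping the dump itself format-neutral and doing conversions like this outside the compiler is the design choice the reply hints at.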