On Wed, Feb 28, 2024 at 4:14 PM David Malcolm <dmalc...@redhat.com> wrote:
>
> On Wed, 2024-02-28 at 08:58 +0100, Richard Biener wrote:
> > On Tue, Feb 27, 2024 at 10:20 PM Robert Dubner <rdub...@symas.com>
> > wrote:
> > >
> > > Richard,
> > >
> > > Thank you very much for your comments.
> > >
> > > When I set out to create the capability, I had a "specification" in
> > > mind.
> > >
> > > I didn't have a clue how to create a GENERIC tree that could be fed
> > > to the middle end in a way that would successfully result in an
> > > executable. And I needed to be able to do that in order to proceed
> > > with the project of creating a COBOL front end.
> > >
> > > So, I came up with the idea of using GCC to compile simple programs,
> > > and to hook into the compiler to examine the trees fed to the middle
> > > end, and to display those trees in the human-readable format I
> > > needed to understand them. And that's what I did.
> > >
> > > My first incarnation generated pure text files, and I used that to
> > > get going.
> > >
> > > After a while I realized that when I used the output file, I was
> > > spending a lot of time searching through the text files. And I had
> > > the brainstorm! Hyperlinks! HTML files! We have the technology!
> > > So, I created the .HTML files as well.
> > >
> > > I found this useful to the point of necessity in order to learn how
> > > to generate the GENERIC trees. I believe it would be equally useful
> > > to the next developer who, for whatever reason, needs to understand,
> > > on a "You need to learn the alphabet before you can learn how to
> > > read" level, what the middle end requires from a GENERIC tree
> > > generated by a front end.
> > >
> > > But I've never used it on a complex program.
> > > I've used it only to learn how to create the GENERIC nodes for very
> > > particular things, and so I would use the -fdump-generic-nodes
> > > feature on a very simple C program that demonstrated, in isolation,
> > > the feature I needed. Once I figured it out, I would create front
> > > end C routines or macros that used the tree.h/tree.cc features to
> > > build those GENERIC trees, and then I would move on.
> > >
> > > I decided to offer it up here, in order to learn how to create
> > > patches and to get to know the people and the process, as well as
> > > from the desire to share it. And instantly I got the "How about a
> > > machine-readable format?" comments. Which are reasonable. So,
> > > because it wasn't hard, I hacked at the existing code to create a
> > > JSON output. (But I remind you that up until now, nobody seems to
> > > have needed a JSON representation.)
> > >
> > > And your observation that the human readable representation could
> > > be made from the JSON representation is totally accurate.
> > >
> > > But that wasn't my specification. My specification was "A tool so
> > > that a human being can examine a simple GENERIC tree to learn how
> > > it's done."
> > >
> > > But it seems to me that we are now moving into the realm of a new
> > > specification.
> > >
> > > Said another way: To go from "A human readable representation of a
> > > simple GENERIC tree" to "A machine readable JSON representation of
> > > an arbitrarily complex GENERIC tree, from which a human readable
> > > representation can be created" means, in effect, starting over on a
> > > different project that I don't need. I already *have* a project
> > > that I am working on -- the COBOL front end.
> > >
> > > The complexity of GENERIC trees is, in my experienced opinion, an
> > > obstacle for the creation of front ends.
> > > The GCC Internals document has a lot of information, but to go from
> > > it to a front end is like using the maintenance manual for an F16
> > > fighter to try to learn to fly the aircraft.
> > >
> > > The program "main(){}" generates a tree with over seventy nodes. I
> > > see no way to document why that's true; it's all arbitrary in the
> > > sense that "this is how GCC works". -fdump-generic-nodes made it
> > > possible for me to figure out how those nodes are connected and,
> > > thus, how to create a new front end. I figure that other developers
> > > might find it useful, as well.
> > >
> > > I guess I am saying that I am not, at this time, able to work on a
> > > whole different tool. I think what I have done so far does
> > > something useful that doesn't seem to otherwise exist in GCC.
> > >
> > > I suppose the question for you is, "Is it useful enough?"
> > >
> > > I won't be offended if the answer is "No" and I hope you won't be
> > > offended by my not having the bandwidth to address your very
> > > thoughtful and valid observations about how it could be better.
> >
> > No offense taken - I did realize how useful this was to you (and
> > specifically the hyper-linking looked even very useful to me!). I
> > often lament the lack of domain-specific visualization tools for the
> > various data structures GCC has - having something for GENERIC would
> > be very welcome.
> >
> > We have for example ways to dump graphviz .dot format graphs of the
> > CFG and some other data structures and do that natively, not via JSON
> > indirection.
>
> FWIW for GCC 15 I've been experimenting with adding a
> text_art::tree_widget class; with that, the analyzer can visualize an
> ana::program_state instance like this (potentially with colorization in
> suitable terminals):
>
> State
> ├─ Region Model
> │  ├─ Current Frame: frame: ‘test_7’@1
> │  ├─ Store
> │  │  ╰─ root region
> │  │     ╰─ (*INIT_VAL(a_10(D)))
> │  │        ╰─ bytes 12-15: ‘int’ {(int)42}
> │  ╰─ Constraints
> │     ├─ Equivalence class ec0
> │     │  ├─ (void *)0B
> │     │  ╰─ ‘0B’
> │     ├─ Equivalence class ec1
> │     │  ╰─ INIT_VAL(a_10(D))
> │     ├─ Equivalence class ec2
> │     │  ╰─ (INIT_VAL(a_10(D))+(sizetype)12)
> │     ├─ ec1: {INIT_VAL(a_10(D))} != ec0: {(void *)0B == [m_constant]‘0B’}
> │     ╰─ ec2: {(INIT_VAL(a_10(D))+(sizetype)12)} != ec0: {(void *)0B == [m_constant]‘0B’}
> ╰─ ‘malloc’ state machine
>    ╰─ 0x62082e0: (INIT_VAL(a_10(D))+(sizetype)12): assumed-non-null (in frame: ‘test_7’@1)
>
> and such visualizations could be added for other hierarchical data
> structures. Also, because it uses text_art::widget, the content at a
> tree node doesn't have to be purely textual, and we could do things
> like the following (which is a mockup):
>
> State
> ├─ Region Model            Bound value │ Effective value
> │  ├─ Stack                            │
> │  │  ├─ frame@0 'foo'                 │
> │  │  │  ├─ 'i'            (int)42     │
> │  │  │  ╰─ 'j'                        │ UNINIT(int)
> │  │  ╰─ frame@1 'bar'                 │
> │  │     ╰─ 'k'                        │
> │  │        ╰─ [3]                     │
> │  │           ├─ .x       INIT_VAL(p) │
> │  │           ╰─ .y       INIT_VAL(q) │
> │  ╰─ Globals                          │
> │     ╰─ 'baz'                         │
> ├─ Constraints
> │  ╰─ (etc)
> ╰─ 'malloc' state
>    ╰─ CONJURED_VALUE('ptr') unchecked('free')
>
>
> That said, our "tree" structure is arguably a directed graph rather
> than a tree (consider e.g. types)
>
> >
> > Incidentally this looks like something fit for a Google Summer of
> > Code project.
> > Ideally it would hook into print-tree.cc providing an alternate
> > structured output.
> > It currently prints in the style
> >
> > <function_decl 0x7ffff71bc600 bswap16
> >     type <function_type 0x7ffff71ba5e8
> >         type <integer_type 0x7ffff702b540 short unsigned int public unsigned HI
> >             size <integer_cst 0x7ffff702d108 constant 16>
> >             unit-size <integer_cst 0x7ffff702d120 constant 2>
> >             align:16 warn_if_not_align:0 symtab:0 alias-set -1
> >             canonical-type 0x7ffff702b540 precision:16 min <integer_cst 0x7ffff702d138 0> max <integer_cst 0x7ffff702d0f0 65535>>
> >         QI
> >         size <integer_cst 0x7ffff702d048 constant 8>
> >         unit-size <integer_cst 0x7ffff702d060 constant 1>
> > ...
> >
> > where you can see it follows tree -> tree edges up to some depth
> > (and avoids repeated expansion). When debugging that's all I have
> > and I have to follow edges by matching up the raw addresses printed,
> > re-dumping those that didn't get expanded. HTML would be indeed
> > so much nicer here (and a more complete output).
> >
> > From a maintenance point of view I think it's important to have
> > "dump a tree node" once, so when fields are added or deemed useful
> > for presenting in a dump you don't have to chase down more than one
> > place. Maintenance is also the reason to not simply accept your
> > contribution as-is.
>
> Presumably we'd want some kind of "visitor" code that captures visiting
> the fields of each type in one place, allowing for different consumers
> of this data (HTML generation, JSON generation etc). Though I'm not
> sure how best to express this double-dispatch (by templates, vfunc, or
> whatnot).
>
> We already have some of this in the form of gengtype and what it
> generates, but I suspect we don't want to add further reliance on
> gengtype.
>
> >
> > I do hope this eventually gets picked up. I've added a project idea
> > to https://gcc.gnu.org/wiki/SummerOfCode
> > and would be willing to mentor it.
>
> FWIW you added it to the "Other projects and project ideas" below
> "Other Project Ideas", where it's much less likely to get noticed than
> under the "Selected Project Ideas".

Yeah, I didn't feel it was "Selected", but maybe that doesn't mean anything?

> >
> > Oh, and I'm looking forward to the actual Cobol work!
>
> Likewise
>
> [...snip...]
>
> Dave
>
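
For illustration only, the "visitor" idea discussed above (one place that enumerates a node's fields, several consumers of that enumeration) might look roughly like the sketch below. It is not GCC code: `toy_node`, `field_visitor`, `visit_fields`, `to_html` and `to_json` are hypothetical names invented for this example, the node type is a tiny stand-in for a real GENERIC tree, and it uses virtual functions, which is only one of the options mentioned (templates would be another).

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for a GENERIC node; real trees carry far more fields.
struct toy_node {
  int id;                // stable identifier used for cross-references
  std::string code;      // e.g. "function_decl"
  std::string name;      // may be empty
  const toy_node *type;  // edge to another node, may be null
};

// Consumers implement this interface and never enumerate fields themselves.
struct field_visitor {
  virtual ~field_visitor() = default;
  virtual void on_scalar(const std::string &field, const std::string &value) = 0;
  virtual void on_edge(const std::string &field, const toy_node &target) = 0;
};

// The single place that knows which fields a node has.
// A newly added field would be reported here once and reach every consumer.
void visit_fields(const toy_node &n, field_visitor &v) {
  assert(!n.code.empty());
  v.on_scalar("code", n.code);
  if (!n.name.empty()) v.on_scalar("name", n.name);
  if (n.type) v.on_edge("type", *n.type);
}

// Consumer 1: HTML list items, with edges rendered as hyperlinks.
struct html_writer : field_visitor {
  std::ostringstream out;
  void on_scalar(const std::string &field, const std::string &value) override {
    out << "<li>" << field << ": " << value << "</li>";
  }
  void on_edge(const std::string &field, const toy_node &target) override {
    out << "<li>" << field << ": <a href=\"#n" << target.id << "\">node "
        << target.id << "</a></li>";
  }
};

// Consumer 2: a flat JSON object, with edges rendered as node ids.
// No string escaping is done here; a real emitter would need it.
struct json_writer : field_visitor {
  std::ostringstream out;
  void on_scalar(const std::string &field, const std::string &value) override {
    out << ",\"" << field << "\":\"" << value << "\"";
  }
  void on_edge(const std::string &field, const toy_node &target) override {
    out << ",\"" << field << "\":" << target.id;
  }
};

std::string to_html(const toy_node &n) {
  html_writer w;
  visit_fields(n, w);
  return "<ul id=\"n" + std::to_string(n.id) + "\">" + w.out.str() + "</ul>";
}

std::string to_json(const toy_node &n) {
  json_writer w;
  visit_fields(n, w);
  return "{\"id\":" + std::to_string(n.id) + w.out.str() + "}";
}
```

In this arrangement both writers see exactly the same fields, so the "chase down more than one place" maintenance concern applies only to `visit_fields`. Whether something like this could be layered onto print-tree.cc is left open here.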