On Wed, 2024-09-11 at 20:49 -0500, tcpreimesber...@gmail.com wrote: > From: Thor C Preimesberger <tcpreimesber...@gmail.com> > > This patch allows the compiler to dump GENERIC trees as JSON objects. > > The dump flag -fdump-tree-original-json dumps each fndecl node in the > C frontend's gimplifier as a JSON object and traverses related nodes > in an analagous manner as to raw-dumping.
Thanks for posting this patch. Are you able to upload somewhere some examples of what the dumps look like? Some high level thoughts: * the patch uses "dummy" throughout as a variable name. To me the name "dummy" suggests something unimportant that we had to give a name to, or something that we'd prefer didn't exist but had to create. However in most(all?) cases "dummy" seems to refer to the json object being created or having properties added to it, and thus the most interesting thing in the function. I suspect that renaming "dummy" to "js_obj" or "json_obj" throughout would be an improvement in readability in terms of capturing the intent of the code (assuming that all of them are indeed json objects). * I think the code is leaking memory for all of the json values created - there are lots of uses of "naked new" in this code, but I don't see any uses of "delete". For example, in > +void > +dump_node_json (const_tree t, dump_flags_t flags, FILE *stream) > +{ > + struct dump_info di; > + dump_queue_p dq; > + dump_queue_p next_dq; > + pretty_printer pp; > + /* Initialize the dump-information structure. */ > + di.stream = stream; > + di.index = 0; > + di.column = 0; > + di.queue = 0; > + di.queue_end = 0; > + di.free_list = 0; > + di.flags = flags; > + di.node = t; > + di.nodes = splay_tree_new (splay_tree_compare_pointers, 0, > + splay_tree_delete_pointers); > + di.json_dump = new json::array (); ^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^ allocated with naked new here > + /* Queue up the first node. */ > + queue (&di, t); > + > + /* Until the queue is empty, keep dumping nodes. */ > + while (di.queue) > + dequeue_and_dump (&di); > + > + di.json_dump->dump(stream, true); > + fputs("\n", stream); > + /* Now, clean up. */ > + for (dq = di.free_list; dq; dq = next_dq) > + { > + next_dq = dq->next; > + free (dq); > + } > + splay_tree_delete (di.nodes); and di.json_dump goes out of scope here and is leaked, I think. So I *think* all of the json values being created during dumping are being leaked. > +} Similarly, in: > +DEBUG_FUNCTION void > +debug_tree_json (tree t) > +{ > + json::object* _x = node_emit_json(t); > + _x->dump(stderr, true); > + fprintf(stderr, "\n"); > +} if I'm reading things right, node_emit_json doesn't "emit" json so much as create a new json::object on the heap via "new", and when "_x" goes out of scope, it's leaked. The pattern in the code seems to be that node_emit_json creates a new json::object and populates it with properties (sometimes recursively). Given that, and that we can use C++11, I recommend using std::unique_ptr<json::object> for it, to capture the intent that this is a heap-allocated pointer with responsibility for being "delete"-d at some point. That way, rather that: json::object* node_emit_json(tree t) { tree op0, op1, type; enum tree_code code; expanded_location xloc; json::object *dummy; json::array* holder; char address_buffer[sizeof(&t)] = {"\0"}; dummy = new json::object (); holder = new json::array (); [...snip...] return dummy; } we could have (perhaps renaming to "node_to_json"): std::unique_ptr<json::object> node_to_json(tree t) { tree op0, op1, type; enum tree_code code; expanded_location xloc; char address_buffer[sizeof(&t)] = {"\0"}; auto js_obj = ::make_unique<json::object> (); // to implicitly use std::unique_ptr<json::object> auto holder = ::make_unique<json::array> (); // likewise std::unique_ptr<json::array> [...snip...] return js_obj; } ...assuming that I'm correctly understanding the ownership of the json values in the patch. Note that we have to use ::make_unique from our make-unique.h, rather than std::make_unique from <memory> since the latter was only added in C++14. Many of our data structures don't properly handle objects with destructors, and I suspect splay_tree is one of these. You can use js_obj.release () to transfer ownership to such data structures, and will (probably) need to manually use "delete" on the pointers in the right places. What happens to "holder" in that function? It seems to get populated with json objects for the various tree nodes found recursively, but then it seems to simply be leaked (or populated then auto-deleted, if we use std::unique_ptr>. Or am I missing something? In case it's helpful, a couple of months ago I converted the SARIF output code from using "naked" json pointers to using std::unique_ptr in: https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658204.html and I found it helped a *lot* with documenting ownership and avoiding leaks. You might find other things of interest in the first half of this patch kit: https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658194.html FWIW in that code I'm also using a class hierarchy of json::object subclasses to help with compliance with a JSON schema (sarif_object is a subclass of json::object), but given that there probably isn't a schema for the JSON in TDF_JSON dumps, that's probably not relevant to this case. FWIW you can test for leaks by running the compiler under valgrind by configuring with --enable-valgrind-annotations and appending "-wrapper valgrind" to the command line (or "-wrapper valgrind,--leak-check=full" IIRC). Thanks again for the patch; hope this is constructive. Dave