On 9/27/18 3:55 PM, David Malcolm wrote: > On Thu, 2018-09-27 at 09:46 +0200, Martin Liška wrote: >> Hi. >> >> For some time we've been providing an intermediate format as >> output of GCOV tool. It's documented in our code that primary >> consumer of it is lcov. Apparently that's not true: >> https://github.com/linux-test-project/lcov/issues/38#issuecomment-371 >> 203750 >> >> So that I decided to come up with providing the very similar >> intermediate >> format in JSON format. It's much easier for consumers to work with. >> >> I'm planning to leave the legacy format for GCC 9.1 and I'll document >> that >> it's deprecated. We can then remove that in next release. >> >> The patch contains a small change to json.{h,cc}, hope David can >> approve that? >> Patch is tested on x86_64-linux-gnu. > > I'm not officially a reviewer for the json stuff, but I can comment, at > least. The changes to json.h/cc are fine by me, FWIW.
Hello. Appreciate your feedback! > > Some high-level observations: > * how big are the JSON files? One of the comments at my Cauldron talk > on optimization records was a concern that the output could be very > large. The JSON files compress really well, so maybe this patch should > gzip on output? Though I don't know if it's appropriate for this case. > iirc, gfortran's module code has an example of gzipping a > pretty_printer buffer. Probably sounds reasonable, for tramp3d I get: 5.8M, where original intermediate format is 1.5M big. Gzipped JSON format is 216K. I'll update the patch to address that. > > * json::object::print doesn't preserve the insertion order of its > name/value pairs; they're written out in whatever order the hashing > leads to (maybe we should fix that though). The top-level JSON value > in your file format is currently a JSON object containing "version" > metadata etc. There's thus a chance that could be at the end, after > the data. Perhaps the top-level JSON value should be an array instead > (as a "tuple"), to guarantee that the versioning metadata occurs at the > start? Or are consumers assumed to read the whole file into memory and > traverse it, tree-like? Well, it's not nice to do list from something that is object. It's micro-optimization in my opinion. On the other hand I would preserve the order of keys as they were added into an object. Moreover, we can have option for that: https://docs.python.org/3/library/json.html#basic-usage Search for sort_keys. Similarly I would appreciate 'indent' option, it's handy for debugging. > > * Similar to the optimization records code, this patch is building a > tree of dynamically-allocated JSON objects in memory, and then printing > it to a pretty_printer, and flushing the pp's buffer to a FILE *. This > is simple and flexible, but I suspect that I may need to rewrite this > for the optimization records case to avoid bloating the memory usage > (e.g. to eliminate the in-memory JSON tree in favor of printing as we > go). Would that concern apply here, or is the pattern OK? One should not bloat memory in GCOV I guess. > > * FWIW I'm working on DejaGnu test coverage for JSON output, but it's > ugly - a nasty shim between .exp and (optional) .py scripts for parsing > the JSON and verifying properties of it, sending the results back in a > form that DejaGnu can digest and integrate into the .sum/.log files. I > hope to post it before stage 1 closes (I'd much prefer to have good > test coverage in place before optimizing how this stuff gets written) Would be good, then I would definitely add a test for GCOV JSON format. > > [...snip...] >> diff --git a/gcc/gcov.c b/gcc/gcov.c >> index e255e4e3922..39d9329d6d0 100644 >> --- a/gcc/gcov.c >> +++ b/gcc/gcov.c > > [...snip...] > >> @@ -1346,6 +1481,15 @@ generate_results (const char *file_name) >> } >> } >> >> + json::object *root = new json::object (); > > [...snip...] >> >> - if (flag_gcov_file && flag_intermediate_format && !flag_use_stdout) >> + if (flag_gcov_file && flag_json_format) >> + root->dump (out); >> + > > It looks like "root" gets leaked here (and thus all of the objects in > the JSON tree also). I don't know if that's a problem, but an easy > fix would be to make "root" be an on-stack object, rather than > allocated on the heap - this ought to lead everything to be cleaned up > when root's dtor runs. I've just tried valgrind and I can't see any leak in current implementation? Martin > > Dave >