On Thu, 2018-09-27 at 09:46 +0200, Martin Liška wrote: > Hi. > > For some time we've been providing an intermediate format as > output of GCOV tool. It's documented in our code that primary > consumer of it is lcov. Apparently that's not true: > https://github.com/linux-test-project/lcov/issues/38#issuecomment-371 > 203750 > > So that I decided to come up with providing the very similar > intermediate > format in JSON format. It's much easier for consumers to work with. > > I'm planning to leave the legacy format for GCC 9.1 and I'll document > that > it's deprecated. We can then remove that in next release. > > The patch contains a small change to json.{h,cc}, hope David can > approve that? > Patch is tested on x86_64-linux-gnu.
I'm not officially a reviewer for the json stuff, but I can comment, at least. The changes to json.h/cc are fine by me, FWIW. Some high-level observations: * how big are the JSON files? One of the comments at my Cauldron talk on optimization records was a concern that the output could be very large. The JSON files compress really well, so maybe this patch should gzip on output? Though I don't know if it's appropriate for this case. iirc, gfortran's module code has an example of gzipping a pretty_printer buffer. * json::object::print doesn't preserve the insertion order of its name/value pairs; they're written out in whatever order the hashing leads to (maybe we should fix that though). The top-level JSON value in your file format is currently a JSON object containing "version" metadata etc. There's thus a chance that could be at the end, after the data. Perhaps the top-level JSON value should be an array instead (as a "tuple"), to guarantee that the versioning metadata occurs at the start? Or are consumers assumed to read the whole file into memory and traverse it, tree-like? * Similar to the optimization records code, this patch is building a tree of dynamically-allocated JSON objects in memory, and then printing it to a pretty_printer, and flushing the pp's buffer to a FILE *. This is simple and flexible, but I suspect that I may need to rewrite this for the optimization records case to avoid bloating the memory usage (e.g. to eliminate the in-memory JSON tree in favor of printing as we go). Would that concern apply here, or is the pattern OK? * FWIW I'm working on DejaGnu test coverage for JSON output, but it's ugly - a nasty shim between .exp and (optional) .py scripts for parsing the JSON and verifying properties of it, sending the results back in a form that DejaGnu can digest and integrate into the .sum/.log files. I hope to post it before stage 1 closes (I'd much prefer to have good test coverage in place before optimizing how this stuff gets written) [...snip...] > diff --git a/gcc/gcov.c b/gcc/gcov.c > index e255e4e3922..39d9329d6d0 100644 > --- a/gcc/gcov.c > +++ b/gcc/gcov.c [...snip...] > @@ -1346,6 +1481,15 @@ generate_results (const char *file_name) > } > } > > + json::object *root = new json::object (); [...snip...] > > - if (flag_gcov_file && flag_intermediate_format && !flag_use_stdout) > + if (flag_gcov_file && flag_json_format) > + root->dump (out); > + It looks like "root" gets leaked here (and thus all of the objects in the JSON tree also). I don't know if that's a problem, but an easy fix would be to make "root" be an on-stack object, rather than allocated on the heap - this ought to lead everything to be cleaned up when root's dtor runs. Dave