On Mon, 2025-02-17 at 12:42 -0500, James K. Lowden wrote: > On Sat, 15 Feb 2025 23:32:37 -0500 > David Malcolm <dmalc...@redhat.com> wrote: > > > > + free(copier); > > > > There?s a manual free of "copier" here, but there?s are various > > error- > > handling early returns paths that will leak. Maybe just use a > > std::string? > > > > Similarly with ?path?; I think this is always leaked. Maybe > > std::string here too. > > > > Have you tried running the compiler under valgrind? Configure with > > ?enable-valgrind-annotations and pass -wrap per=valgrind to the > > driver. > > It's no accident, comrade. ;-) > > My design criterion: the parser's memory requirements are linear with > the input. As grows the COBOL text, so grows the memory > consumption. Any more would be wrong; any less is pointless. > > The parser makes a single pass over the input. It can't "leak" > except in a loop. In general it therefore never calls free(3). Only > when a string is being built up of consecutive allocations in a loop > do we take any care with free. > > Before anyone suggests that's wasteful of memory, let's remind > ourselves of the compiler's design. The input text is transformed > into a GENERIC tree. The compiled program *must* fit in memory. > Much of that input needs to be retained as debug strings and supplied > values. > > We have tested gcobol on large COBOL inputs, hundreds of thousands of > lines. gcobol compiles those programs, as is, without freeing > memory, on virtual machines that can barely link cc1. > > In the instant case, when open_file() fails, the compiler will soon > terminate for lack of input. Code generation will not be engaged. > The lost strings are literally no concern at all. (I can see why > that might not be obvious on first reading, and I hope I don't sound > harsh or dismissive.) > > As for std::string, it would only complicate the front end. Most > strings in the parser are either fed back in to some C API or become > parts of GENERIC nodes. I admit having been tempted by > std::stringstream for concatenation but in the end asprintf did most > of what was needed. > > > Have you tried running the compiler under valgrind? Configure with > > ?enable-valgrind-annotations and pass -wrap per=valgrind to the > > driver. > > We have not tried that incantation, no. We used valgrind for > corruption problems, both times. ;-) We measured performance and > memory use with the excellent Linux perf tool. That is how we found > the astonishing problems with std::regex (and also with how we were > using it). > > I hope you see that we're taking care, but not too much care, with > memory. If there is a loop that is missing a free, we'll fix it, but > I bet it won't matter, because these are just string fragments for > filenames and such. To dutifully call free for every allocated scrap > of parsed input would be counterproductive: error-prone, and little > if anything saved.
Understood. The rest of the compiler is relatively "clean" w.r.t. valgrind, so when we do get leaks, valgrind shows them up clearly. So a benefit of my suggested approach is that if you *do* need to use valgrind at some point, it doesn't get swamped by noise from the frontend. But you can probably filter out all allocations originating from the frontend and get similar benefits. Dave