On Wed, Nov 26, 2014 at 09:49:52AM -0500, David Malcolm wrote: > On Wed, 2014-11-26 at 09:13 +0100, Uros Bizjak wrote: > > Hello! > > > > > cgraph*.c and ipa-*.c use xstrdup on strings when dumping them via > > > fprintf, leaking all of the duplicated buffers. > > > > > > Is/was there a reason for doing this? > > > > Yes, please see [1] and PR 53136 [2]. As said in [1]: > > > > "There is a problem with multiple calls of cgraph_node_name in fprintf > > dumps. Please note that C++ uses caching in > > cxx_printable_name_internal (aka LANG_HOOKS_DECL_PRINTABLE_NAME), so > > when cxx_printable_name_internal is called multiple times from printf > > (i.e. fprintf "%s/%i -> %s/%i"), it can happen that the first string > > gets evicted by the second call, before fprintf is fully evaluated." > > > > > Taking them out fixes these leaks (seen when dumping is enabled): > > > > But you will get "Invalid read of size X" instead. > > > > The patch at [1] fixed these, but introduced memory leaks, which were > > tolerable at the time: > > > > "I think that small memory leak is tolerable here (the changes are > > exclusively in the dump code), and follows the same approach as in > > java frontend." > > > > It seems that these assumptions are not valid anymore. > > > > [1] https://gcc.gnu.org/ml/gcc-patches/2012-04/msg01904.html > > [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53136 > > Aha! Thanks. > > I was only running under valgrind when testing the jit code, and the jit > code looks like a frontend to the rest of GCC, it doesn't provide an > implementation of LANG_HOOKS_DECL_PRINTABLE_NAME. I ran the rest of the > gcc testsuite outside of gcc. So perhaps the duplications of the > transient decl strings is still needed, and my patch would undo that > fix. Oops. > > I'm about to disappear for a US holiday, but maybe a good solution here > is to invent something like a "xstrdup_for_dump" function like this: > > /* When using fprintf (or similar), problems can arise with > transient generated strings. Many string-generation APIs > only support one result being alive at once (e.g. by > returning a pointer to a statically-allocated buffer). > > If there is more than one generated string within one > fprintf call: the first string gets evicted or overwritten > by the second, before fprintf is fully evaluated. > See e.g. PR/53136. > > This function provides a workaround for this, by providing > a simple way to create copies of these transient strings, > without the need to have explicit cleanup: > > fprintf (dumpfile, "string 1: %s string 2:%s\n", > xstrdup_for_dump (EXPR_1), > xstrdup_for_dump (EXPR_2)); > > The copied string is eventually freed, from code within > toplev::finalize. */ > > extern const char * > xstrdup_for_dump (const char *transient_str); > > We could then use this instead of xstrdup at these callsites, which > would document what's going on. > > (I'm not in love with the name, but hopefully the idea is clear). > > This could be implemented either: > > (A) in terms of the "long-term allocator" idea here: > https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02420.html > which uses a obstack on the gcc::context, eventually freeing the > allocations from toplev::finalize (although that's currently only called > from libgccjit.so, but we could call it from cc1 etc when running under > valgrind). > > (B) or in terms of ggc_xstrdup, with the assumption that no GC can occur > in anything within a fprintf call. > > That would document what's going on, ensure that we don't have > use-after-free issues, and stop memory leak warnings from valgrind > concerning these duplicates. > > Thoughts?
"bleh"? ;-) however I'm not sure I have a better idea, but here are some other options. - have lang_hooks_decl_printable_name return something like auto_ptr<char *> so you'd write fprintf ("blah: %s", hooks.printable_name (t).get ()); - depending on who all calls this function we could also replace the static cache with a hash map from trees to strings. Trev > Dave >