On 2021-08-16T14:10:00-0600, Martin Sebor <mse...@gmail.com> wrote:
On 8/16/21 6:44 AM, Thomas Schwinge wrote:
On 2021-08-12T17:15:44-0600, Martin Sebor via Gcc <gcc@gcc.gnu.org> wrote:
On 8/6/21 10:57 AM, Thomas Schwinge wrote:
So I'm trying to do some C++... ;-)
Given:
/* A map from SSA names or var decls to record fields. */
typedef hash_map<tree, tree> field_map_t;
/* For each propagation record type, this is a map from SSA names or
   var decls to propagate, to the field in the record type that should
   be used for transmission and reception.  */
typedef hash_map<tree, field_map_t> record_field_map_t;
Thus, that's a 'hash_map<tree, hash_map<tree, tree>>'. (I may do that,
right?) Looking through GCC implementation files, almost all uses of
'hash_map' boil down to a pointer key ('tree', for example) and a
pointer or integer value.
Right. Because most GCC containers rely exclusively on GCC's own
uses for testing, if your use case is novel in some way, chances
are it might not work as intended in all circumstances.
I've wrestled with hash_map a number of times. A use case that's
close to yours (i.e., a non-trivial value type) is in cp/parser.c:
see class_to_loc_map_t.
Indeed, at the time you sent this email, I had already started looking
into that one! (The Fortran test cases that I originally analyzed, which
triggered other cases of non-POD types/non-trivial destructors, didn't
result in a memory leak, because their non-trivial constructors don't
actually allocate any resources dynamically -- that's indeed different
in this case here.) ..., and indeed:
(I don't remember if I tested it for leaks
though. It's used to implement -Wmismatched-tags so compiling
a few tests under Valgrind should show if it does leak.)
... it does leak memory at present. :-| (See the attached commit log
for details of one example.)
(Attached "Fix 'hash_table::expand' to destruct stale Value objects"
again.)
To that effect, to document the current behavior, I propose to
"Add more self-tests for 'hash_map' with Value type with non-trivial
constructor/destructor"
(We've done that in commit e4f16e9f357a38ec702fb69a0ffab9d292a6af9b
"Add more self-tests for 'hash_map' with Value type with non-trivial
constructor/destructor", quickly followed by bug fix
commit bb04a03c6f9bacc890118b9e12b657503093c2f8
"Make 'gcc/hash-map-tests.c:test_map_of_type_with_ctor_and_dtor_expand'
work on 32-bit architectures [PR101959]".)
(Also cherry-pick into release branches, eventually?)
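The gist of such a self-test is a Value type that tallies its
constructor and destructor invocations, so that the test can assert the
two balance. A minimal sketch (the 'val_t' type and the details here
are hypothetical; the committed self-test in gcc/hash-map-tests.c is
more thorough, and exercises an actual 'hash_map'):

#include <cassert>

/* Sketch only: count constructions and destructions, and check that
   they balance once all instances are gone.  */
struct val_t
{
  static int nctor, ndtor;
  val_t () { ++nctor; }
  val_t (const val_t &) { ++nctor; }
  ~val_t () { ++ndtor; }
};
int val_t::nctor, val_t::ndtor;

int
main ()
{
  {
    val_t a;
    val_t b (a);	/* One default construction, one copy.  */
  }
  /* Every construction must be matched by a destruction.  */
  assert (val_t::nctor == val_t::ndtor);
}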
Then:
record_field_map_t field_map ([...]); // see below
for ([...])
  {
    tree record_type = [...];
    [...]
    bool existed;
    field_map_t &fields
      = field_map.get_or_insert (record_type, &existed);
    gcc_checking_assert (!existed);
    [...]
    for ([...])
      fields.put ([...], [...]);
    [...]
  }
[stuff that looks up elements from 'field_map']
field_map.empty ();
This generally works.
If I instantiate 'record_field_map_t field_map (40);', Valgrind is happy.
If, however, I instantiate 'record_field_map_t field_map (13);' (where
'13' would be the default for 'hash_map'), Valgrind complains:
2,080 bytes in 10 blocks are definitely lost in loss record 828 of 876
   at 0x483DD99: calloc (vg_replace_malloc.c:762)
   by 0x175F010: xcalloc (xmalloc.c:162)
   by 0xAF4A2C: hash_table<hash_map<tree_node*, tree_node*, simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> >::hash_entry, false, xcallocator>::hash_table(unsigned long, bool, bool, bool, mem_alloc_origin) (hash-table.h:275)
   by 0x15E0120: hash_map<tree_node*, tree_node*, simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> >::hash_map(unsigned long, bool, bool, bool) (hash-map.h:143)
   by 0x15DEE87: hash_map<tree_node*, hash_map<tree_node*, tree_node*, simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> >, simple_hashmap_traits<default_hash_traits<tree_node*>, hash_map<tree_node*, tree_node*, simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> > > >::get_or_insert(tree_node* const&, bool*) (hash-map.h:205)
   by 0x15DD52C: execute_omp_oacc_neuter_broadcast() (omp-oacc-neuter-broadcast.cc:1371)
   [...]
(That's with '#pragma GCC optimize "O0"' at the top of the 'gcc/*.cc'
file.)
My suspicion was that this is due to the 'field_map' getting resized as
it incrementally grows ('40' being big enough for that never to happen),
and the non-POD (?) value objects somehow not being properly handled
during that. Working my way a bit through 'gcc/hash-map.*' and
'gcc/hash-table.*' (but not claiming that I understand all of that
offhand), it seems my theory is right: I'm able to plug this memory
leak as follows:
--- gcc/hash-table.h
+++ gcc/hash-table.h
@@ -820,6 +820,8 @@ hash_table<Descriptor, Lazy, Allocator>::expand ()
 	{
 	  value_type *q = find_empty_slot_for_expand (Descriptor::hash (x));
 	  new ((void*) q) value_type (std::move (x));
+	  //BAD Descriptor::remove (x); // (doesn't make sense and) a ton of "Invalid read [...] inside a block of size [...] free'd"
+	  x.~value_type (); //GOOD This seems to work! -- but does it make sense?
 	}
       p++;
However, that doesn't exactly look like a correct fix, does it? I'd
expect such a manual destructor call in combination with placement new
(which is being used here, obviously) -- but this one comes after a
'std::move'? Still, this change survives a smoke-test-like run of parts
of the GCC testsuite; a bootstrap and complete testsuite run are now
ongoing.
If explicitly calling the dtor on the moved object is the right thing
to do I'd expect to see such invocations elsewhere in hash_table but
I don't. It does seem like removed elements ought to be destroyed,
but it also seems like the destruction should go through some traits
class (e.g., call Descriptor::remove and mark_deleted or do some
similar dance), and be called from other member functions that move
elements.
So, we shall assume that any "real C++" use case (that is, a C++ class)
uses the appropriate removal policy: 'typed_delete_remove', whose
'delete' does destroy the C++ object, where applicable, or else
'typed_noop_remove'.
(Shall we look into the few places that use 'typed_free_remove' via
'free_ptr_hash', and potentially replace them with 'typed_delete_remove'?
Or is there a reason for the two schemes to co-exist, other than for
legacy reasons? Anyway, that's for another day.)
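For reference, the three removal policies boil down to roughly the
following -- condensed and paraphrased from gcc/hash-traits.h, not a
verbatim copy; see the actual file for the real definitions:

#include <cstdlib>

/* Paraphrased from gcc/hash-traits.h: the three removal policies
   differ only in what their 'remove' hook does to an element.  */

template <typename Type>
struct typed_noop_remove
{
  static void remove (Type &) {}	/* Nothing to release.  */
};

template <typename Type>
struct typed_free_remove
{
  static void remove (Type *p) { free (p); }	/* Deallocation only;
						   no destructor runs.  */
};

template <typename Type>
struct typed_delete_remove
{
  static void remove (Type *p) { delete p; }	/* Destructor runs,
						   then deallocation.  */
};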
I find all these traits pretty much impenetrable.
(Indeed. ... which triggered the reflex in me to try simplifying this
by getting rid of 'typed_free_remove' etc...)
I guess
intuitively, I'd expect Descriptor::remove() to destroy an element,
but I have no idea if that would be right given the design.
So 'typed_delete_remove' does destruct via 'delete'. 'typed_free_remove'
doesn't -- but is only used via 'free_ptr_hash', where this isn't
relevant? 'typed_noop_remove' I haven't considered yet regarding that
issue. Anyway, that's all for another day.
What is different in the 'hash_table::expand' case is that all the Value
objects do get 'std::move'd into a new blob of memory via placement new
(so a subsequent 'delete' via 'typed_delete_remove' is not appropriate),
but then the stale Value objects never get destructed. And indeed an
explicit destructor call (which, as I understand it, is a no-op for
primitive types; "pseudo destructor") is the right thing to do; see
<https://stackoverflow.com/questions/6730403/how-to-delete-object-constructed-via-placement-new-operator>
and others, for example. (Therefore, I don't think this needs to be
routed through a "traits" function, but can rather be done in-line here,
after each placement new, before deallocation of the original blob of
memory. Also, I argue it's the right thing to do for 'm_ggc' as well,
because even though in that case we're not going to leak memory (GC
will reclaim it), we may still leak other resources dynamically
allocated in a non-trivial constructor.)
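To make that pairing concrete, here is a stand-alone sketch (not GCC
code; the resource-owning 'val' type is made up for illustration) of
exactly what 'expand' does to a single live entry, plus the proposed
fix:

#include <cassert>
#include <cstdlib>
#include <new>
#include <utility>

/* Stand-alone sketch, not GCC code: 'val' owns a dynamically
   allocated resource, as a non-trivial Value type might.  */
struct val
{
  static int live;	/* Constructions minus destructions.  */
  int *payload;
  val () : payload (new int (42)) { ++live; }
  val (val &&other) : payload (other.payload)
  { other.payload = nullptr; ++live; }
  ~val () { delete payload; --live; }
};
int val::live;

int
main ()
{
  void *omem = malloc (sizeof (val));
  val *x = new (omem) val ();	/* Entry in the old storage.  */

  void *nmem = malloc (sizeof (val));
  /* What 'expand' does: move-construct into the new storage...  */
  val *q = new (nmem) val (std::move (*x));

  /* ...and the fix: explicitly destruct the stale moved-from object.
     For primitive types this is a no-op ("pseudo destructor"); for
     class types it releases whatever the object still owns.  Without
     this call, 'live' would stay at 1.  */
  x->~val ();
  free (omem);

  q->~val ();
  free (nmem);

  assert (val::live == 0);
}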
Yes, of course, all elements eventually need to be destroyed.
My only hesitation was whether it should be done via one of these
traits classes (like it's done in C++ containers via allocators)
rather than directly
Given that there is (apparently) no issue with doing a placement new in
'hash_table::expand', I'd assume a corresponding explicit destructor
call would likewise be appropriate? (But I'll of course route this
through a (new) "traits" function if someone explains why that is the
right thing to do.)
and whether there might be other places
where the destruction might need to happen.
(Possibly, yes, per discussion above -- but that's a separate issue?)
The attached "Fix 'hash_table::expand' to destruct stale Value objects"
does prevent my original problem, does address the current 'class2loc'
memory leak in 'cp/parser.c' (see commit log for one example), and
adjusts the new
'gcc/hash-map-tests.c:test_map_of_type_with_ctor_and_dtor_expand' test
case such that the number of constructor calls matches the number of
destructor calls. After some careful review regarding C++ details
(please!), OK to push to master branch? (Also cherry-pick into release
branches, eventually?) Is the source code comment that I'm adding
sufficiently explanatory and correct in C++ terms?
Ping.
Shouldn't the hash_table dtor (and perhaps also empty()) also
destroy the elements? (Otherwise, what destroys the elements
newly constructed here, or anywhere else for that matter, such
as in the hash_table ctor?)
Per my understanding, per discussion above, they (eventually) do get
destroyed via 'delete' in 'typed_delete_remove', for example, via
'Descriptor::remove' calls for all/relevant entries in
'hash_table::~hash_table', 'hash_table::empty_slow',
'hash_table::clear_slot', 'hash_table::remove_elt_with_hash'.
(This means that if there has been an intermediate 'expand', this may
(eventually) destroy objects at a different memory location from where
they were originally created -- but that's fine.)
The 'expand' case itself is different: any (live) entries are *moved*
into a new storage memory object via placement new. (And then the
hollow entries in the old storage memory object linger.)
Also, shouldn't the destroyed slot be marked either deleted or
empty?)
Per my understanding, irrelevant in 'expand': working through 'oentries',
the *move* is only done 'if (!is_empty (x) && !is_deleted (x))' (so
combined with the item above, there is no memory leak for any entries
that have already been 'remove'd -- they have already been destructed),
and the whole (old) storage memory object will be deallocated right after
the 'oentries' loop.
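To illustrate that walk, here is another stand-alone sketch (again not
GCC code; the slot layout and names are made up): only live slots are
moved and then destructed; empty and already-'remove'd slots are
skipped, and the old storage is freed wholesale right after the loop.

#include <cassert>
#include <cstdlib>
#include <new>
#include <utility>

struct val
{
  static int live;
  val () { ++live; }
  val (val &&) { ++live; }
  ~val () { --live; }
};
int val::live;

int
main ()
{
  const int n = 4;
  /* Old storage: raw memory; track which slots hold live objects.  */
  val *oentries = static_cast<val *> (malloc (n * sizeof (val)));
  bool is_live[n] = { true, false, true, false };
  for (int i = 0; i < n; i++)
    if (is_live[i])
      new (static_cast<void *> (&oentries[i])) val ();

  /* New storage, twice the size, as an expansion would allocate.  */
  val *nentries = static_cast<val *> (malloc (2 * n * sizeof (val)));
  int nlive = 0;
  for (int i = 0; i < n; i++)
    if (is_live[i])	/* Cf. '!is_empty (x) && !is_deleted (x)'.  */
      {
	new (static_cast<void *> (&nentries[nlive++]))
	  val (std::move (oentries[i]));
	oentries[i].~val ();	/* Destroy the stale moved-from object.  */
      }
  free (oentries);	/* Whole old storage, right after the loop.  */

  for (int i = 0; i < nlive; i++)
    nentries[i].~val ();
  free (nentries);

  assert (val::live == 0);
}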
(Sorry, I realize I have more questions than answers.)
No worries, happens to me most of the time! Thanks for looking into
this.
Regards
Thomas
-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201,
80634 Munich; limited liability company; Managing Directors: Thomas
Heurung, Frank Thürauf; Registered office: Munich; Commercial Register:
Munich, HRB 106955