Re: [PING^2] Re: Fix 'hash_table::expand' to destruct stale Value objects

Richard Biener via Gcc Fri, 17 Sep 2021 05:10:23 -0700

On Fri, Sep 17, 2021 at 1:17 PM Thomas Schwinge <[email protected]> wrote:
>
> Hi!
>
> On 2021-09-10T10:00:25+0200, I wrote:
> > On 2021-09-01T19:31:19-0600, Martin Sebor via Gcc-patches 
> > <[email protected]> wrote:
> >> On 8/30/21 4:46 AM, Thomas Schwinge wrote:
> >>> Ping -- we still need to plug the memory leak; see patch attached, and/or
> >>> long discussion here:
> >>
> >> Thanks for answering my questions.  I have no concerns with going
> >> forward with the patch as is.
> >
> > Thanks, Martin.  Ping for formal approval (and review for using proper
> > C++ terminology in the 'gcc/hash-table.h:hash_table::expand' source code
> > comment that I'm adding).  Patch again attached, for easy reference.
>
> Ping, once again.


I'm happy when a C++ literate approves the main change which I quote as

          new ((void*) q) value_type (std::move (x));
+
+         /* Manually invoke destructor of original object, to counterbalance
+            object constructed via placement new.  */
+         x.~value_type ();

but I had the impression that std::move already "moves away" from the source?
That said, the dance above looks iffy, there must be a nicer way to "move"
an object in C++?

What happens if the dtor is deleted btw?  Shouldn't you use sth
like a placement 'delete' instead of invoking a DTOR?

But the above clearly shows I know nothing of C++ :P

Richard.

>
> Grüße
>  Thomas
>
>
> >> Just a suggestion/request: unless
> >> this patch fixes all the outstanding problems you know of or suspect
> >> in this area (leaks/missing dtor calls) and unless you plan to work
> >> on those in the near future, please open a bug for them with a brain
> >> dump of what you learned.  That should save us time when the day
> >> comes to tackle those.
> >
> > ACK.  I'm not aware of any additional known problems.  (In our email
> > discussion, we did have some "vague ideas" for opportunities of
> > clarification/clean-up, but these aren't worth filing PRs for; needs
> > someone to gain understanding, taking a look.)
> >
> >
> > Grüße
> >  Thomas
> >
> >
> >>> On 2021-08-16T14:10:00-0600, Martin Sebor <[email protected]> wrote:
> >>>> On 8/16/21 6:44 AM, Thomas Schwinge wrote:
> >>>>> On 2021-08-12T17:15:44-0600, Martin Sebor via Gcc <[email protected]> 
> >>>>> wrote:
> >>>>>> On 8/6/21 10:57 AM, Thomas Schwinge wrote:
> >>>>>>> So I'm trying to do some C++...  ;-)
> >>>>>>>
> >>>>>>> Given:
> >>>>>>>
> >>>>>>>        /* A map from SSA names or var decls to record fields.  */
> >>>>>>>        typedef hash_map<tree, tree> field_map_t;
> >>>>>>>
> >>>>>>>        /* For each propagation record type, this is a map from SSA 
> >>>>>>> names or var decls
> >>>>>>>           to propagate, to the field in the record type that should 
> >>>>>>> be used for
> >>>>>>>           transmission and reception.  */
> >>>>>>>        typedef hash_map<tree, field_map_t> record_field_map_t;
> >>>>>>>
> >>>>>>> Thus, that's a 'hash_map<tree, hash_map<tree, tree>>'.  (I may do 
> >>>>>>> that,
> >>>>>>> right?)  Looking through GCC implementation files, very most of all 
> >>>>>>> uses
> >>>>>>> of 'hash_map' boil down to pointer key ('tree', for example) and
> >>>>>>> pointer/integer value.
> >>>>>>
> >>>>>> Right.  Because most GCC containers rely exclusively on GCC's own
> >>>>>> uses for testing, if your use case is novel in some way, chances
> >>>>>> are it might not work as intended in all circumstances.
> >>>>>>
> >>>>>> I've wrestled with hash_map a number of times.  A use case that's
> >>>>>> close to yours (i.e., a non-trivial value type) is in cp/parser.c:
> >>>>>> see class_to_loc_map_t.
> >>>>>
> >>>>> Indeed, at the time you sent this email, I already had started looking
> >>>>> into that one!  (The Fortran test cases that I originally analyzed, 
> >>>>> which
> >>>>> triggered other cases of non-POD/non-trivial destructor, all didn't
> >>>>> result in a memory leak, because the non-trivial constructor doesn't
> >>>>> actually allocate any resources dynamically -- that's indeed different 
> >>>>> in
> >>>>> this case here.)  ..., and indeed:
> >>>>>
> >>>>>> (I don't remember if I tested it for leaks
> >>>>>> though.  It's used to implement -Wmismatched-tags so compiling
> >>>>>> a few tests under Valgrind should show if it does leak.)
> >>>>>
> >>>>> ... it does leak memory at present.  :-| (See attached commit log for
> >>>>> details for one example.)
> >>>
> >>> (Attached "Fix 'hash_table::expand' to destruct stale Value objects"
> >>> again.)
> >>>
> >>>>> To that effect, to document the current behavior, I propose to
> >>>>> "Add more self-tests for 'hash_map' with Value type with non-trivial
> >>>>> constructor/destructor"
> >>>
> >>> (We've done that in commit e4f16e9f357a38ec702fb69a0ffab9d292a6af9b
> >>> "Add more self-tests for 'hash_map' with Value type with non-trivial
> >>> constructor/destructor", quickly followed by bug fix
> >>> commit bb04a03c6f9bacc890118b9e12b657503093c2f8
> >>> "Make 'gcc/hash-map-tests.c:test_map_of_type_with_ctor_and_dtor_expand'
> >>> work on 32-bit architectures [PR101959]".
> >>>
> >>>>> (Also cherry-pick into release branches, eventually?)
> >>>
> >>>>>>> Then:
> >>>>>>>
> >>>>>>>        record_field_map_t field_map ([...]); // see below
> >>>>>>>        for ([...])
> >>>>>>>          {
> >>>>>>>            tree record_type = [...];
> >>>>>>>            [...]
> >>>>>>>            bool existed;
> >>>>>>>            field_map_t &fields
> >>>>>>>              = field_map.get_or_insert (record_type, &existed);
> >>>>>>>            gcc_checking_assert (!existed);
> >>>>>>>            [...]
> >>>>>>>            for ([...])
> >>>>>>>              fields.put ([...], [...]);
> >>>>>>>            [...]
> >>>>>>>          }
> >>>>>>>        [stuff that looks up elements from 'field_map']
> >>>>>>>        field_map.empty ();
> >>>>>>>
> >>>>>>> This generally works.
> >>>>>>>
> >>>>>>> If I instantiate 'record_field_map_t field_map (40);', Valgrind is 
> >>>>>>> happy.
> >>>>>>> If however I instantiate 'record_field_map_t field_map (13);' (where 
> >>>>>>> '13'
> >>>>>>> would be the default for 'hash_map'), Valgrind complains:
> >>>>>>>
> >>>>>>>        2,080 bytes in 10 blocks are definitely lost in loss record 
> >>>>>>> 828 of 876
> >>>>>>>           at 0x483DD99: calloc (vg_replace_malloc.c:762)
> >>>>>>>           by 0x175F010: xcalloc (xmalloc.c:162)
> >>>>>>>           by 0xAF4A2C: hash_table<hash_map<tree_node*, tree_node*, 
> >>>>>>> simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> 
> >>>>>>> >::hash_entry, false, xcallocator>::hash_table(unsigned long, bool, 
> >>>>>>> bool, bool, mem_alloc_origin) (hash-table.h:275)
> >>>>>>>           by 0x15E0120: hash_map<tree_node*, tree_node*, 
> >>>>>>> simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> 
> >>>>>>> >::hash_map(unsigned long, bool, bool, bool) (hash-map.h:143)
> >>>>>>>           by 0x15DEE87: hash_map<tree_node*, hash_map<tree_node*, 
> >>>>>>> tree_node*, simple_hashmap_traits<default_hash_traits<tree_node*>, 
> >>>>>>> tree_node*> >, simple_hashmap_traits<default_hash_traits<tree_node*>, 
> >>>>>>> hash_map<tree_node*, tree_node*, 
> >>>>>>> simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> > 
> >>>>>>> > >::get_or_insert(tree_node* const&, bool*) (hash-map.h:205)
> >>>>>>>           by 0x15DD52C: execute_omp_oacc_neuter_broadcast() 
> >>>>>>> (omp-oacc-neuter-broadcast.cc:1371)
> >>>>>>>           [...]
> >>>>>>>
> >>>>>>> (That's with '#pragma GCC optimize "O0"' at the top of the 'gcc/*.cc'
> >>>>>>> file.)
> >>>>>>>
> >>>>>>> My suspicion was that it is due to the 'field_map' getting resized as 
> >>>>>>> it
> >>>>>>> incrementally grows (and '40' being big enough for that to never 
> >>>>>>> happen),
> >>>>>>> and somehow the non-POD (?) value objects not being properly handled
> >>>>>>> during that.  Working my way a bit through 'gcc/hash-map.*' and
> >>>>>>> 'gcc/hash-table.*' (but not claiming that I understand all that, off
> >>>>>>> hand), it seems as if my theory is right: I'm able to plug this memory
> >>>>>>> leak as follows:
> >>>>>>>
> >>>>>>>        --- gcc/hash-table.h
> >>>>>>>        +++ gcc/hash-table.h
> >>>>>>>        @@ -820,6 +820,8 @@ hash_table<Descriptor, Lazy, 
> >>>>>>> Allocator>::expand ()
> >>>>>>>                 {
> >>>>>>>                   value_type *q = find_empty_slot_for_expand 
> >>>>>>> (Descriptor::hash (x));
> >>>>>>>              new ((void*) q) value_type (std::move (x));
> >>>>>>>        +     //BAD Descriptor::remove (x); // (doesn't make sense 
> >>>>>>> and) a ton of "Invalid read [...] inside a block of size [...] free'd"
> >>>>>>>        +     x.~value_type (); //GOOD This seems to work!  -- but 
> >>>>>>> does it make sense?
> >>>>>>>                 }
> >>>>>>>
> >>>>>>>               p++;
> >>>>>>>
> >>>>>>> However, that doesn't exactly look like a correct fix, does it?  I'd
> >>>>>>> expect such a manual destructor call in combination with placement new
> >>>>>>> (that is being used here, obviously) -- but this is after 'std::move'?
> >>>>>>> However, this also survives a smoke-test-like run of parts of the GCC
> >>>>>>> testsuite, bootstrap and complete run now ongoing.
> >>>>>>
> >>>>>> If explicitly calling the dtor on the moved object is the right thing
> >>>>>> to do I'd expect to see such invocations elsewhere in hash_table but
> >>>>>> I don't.  It does seem like removed elements ought to be destroyed,
> >>>>>> but it also seems like the destruction should go through some traits
> >>>>>> class (e.g., call Descriptor::remove and mark_deleted or do some
> >>>>>> similar dance), and be called from other member functions that move
> >>>>>> elements.
> >>>>>
> >>>>> So, we shall assume that any "real C++" use case (that is, C++ class) is
> >>>>> using the appropriate C++ method, that is, 'typed_delete_remove', which
> >>>>> does 'delete', which does destroy the C++ object, if applicable, else
> >>>>> 'typed_noop_remove'.
> >>>>>
> >>>>> (Shall we look into the few places that use 'typed_free_remove' via
> >>>>> 'free_ptr_hash', and potentially replace them with 
> >>>>> 'typed_delete_remove'?
> >>>>> Or is there a reason for the two schemes to co-exist, other than for
> >>>>> legacy reasons?  Anyway, that's for another day.)
> >>>>
> >>>> I find all these these traits pretty much impenetrable.
> >>>
> >>> (Indeed.  ... which triggered this reflex with me, to try simplifying
> >>> this by getting rid of 'typed_free_remove' etc...)
> >>>
> >>>> I guess
> >>>> intuitively, I'd expect Descriptor::remove() to destroy an element,
> >>>> but I have no idea if that would be right given the design.
> >>>
> >>> So 'typed_delete_remove' does destruct via 'delete'.  'typed_free_remove'
> >>> doesn't -- but is only used via 'free_ptr_hash', where this isn't
> >>> relevant?  'typed_noop_remove' I haven't considered yet regarding that
> >>> issue.  Anyway, that's all for another day.
> >>>
> >>>>> What is different in the 'hash_table::expand' case is that all the Value
> >>>>> objects do get 'std::move'd into a new blob of memory via placement new
> >>>>> (so a subsequent 'delete' via 'typed_delete_remove' is not appropriate),
> >>>>> but then the stale Value objects never get destructed.  And indeed an
> >>>>> explicit destructor call (which, as I understand is a no-op for 
> >>>>> primitive
> >>>>> types; "pseudo destructor") is the right thing to do; see
> >>>>> <https://stackoverflow.com/questions/6730403/how-to-delete-object-constructed-via-placement-new-operator>
> >>>>> and others, for example.  (Therefore, I don't think this needs to be
> >>>>> routed through a "traits" function, but can rather be done in-line here,
> >>>>> after each placement new, before deallocation of the original blob of
> >>>>> memory.  Also, I argue it's the right thing to do also for 'm_ggc',
> >>>>> because even if in that case we're not going to leak memory (GC will
> >>>>> reclaim), but we still may leak other resources dynamically allocated in
> >>>>> a non-trivial constructor.)
> >>>>
> >>>> Yes, of course, all elements need to be eventually be destroyed.
> >>>> My only hesitation was whether it should be done via one of these
> >>>> traits classes (like it's done in C++ containers via allocators)
> >>>> rather than directly
> >>>
> >>> Given that there is (apparently) no issue to do a placement new in
> >>> 'hash_table::expand', I'd asumme a -- corresponding -- explicit
> >>> destructor call would be likewise appropriate?  (But I'll of course route
> >>> this through a (new) "traits" function if someone explains why this is
> >>> the right thing to do.)
> >>>
> >>>> and whether there might be other places
> >>>> where the destruction night need to happen.
> >>>
> >>> (Possibly, yes, per discussion above -- but that's a separate issue?)
> >>>
> >>>>> The attached "Fix 'hash_table::expand' to destruct stale Value objects"
> >>>>> does prevent my original problem, does address the current 'class2loc'
> >>>>> memory leak in 'cp/parser.c' (see commit log for one example), and
> >>>>> adjusts the new
> >>>>> 'gcc/hash-map-tests.c:test_map_of_type_with_ctor_and_dtor_expand' test
> >>>>> case such that number of constructor calls matches the number of
> >>>>> destructor calls.  After some careful review regarding C++ details
> >>>>> (please!), OK to push to master branch?  (Also cherry-pick into release
> >>>>> branches, eventually?)  Is the source code comment that I'm adding
> >>>>> sufficiently explanatory and correct in C++ terms?
> >>>
> >>> Ping.
> >>>
> >>>> Shouldn't the hash_table dtor (and perhaps also empty())  also
> >>>> destroy the elements?  (Otherwise, what destroys the elements
> >>>> newly constructed here, or anywhere else for that matter, such
> >>>> as in the hash_table ctor?)
> >>>
> >>> Per my understanding, per discussion above, they (eventually) do get
> >>> destroyed via 'delete' in 'typed_delete_remove', for example, via
> >>> 'Descriptor::remove' calls for all/relevant entries in
> >>> 'hash_table::~hash_table', 'hash_table::empty_slow',
> >>> 'hash_table::clear_slot', 'hash_table::remove_elt_with_hash'.
> >>>
> >>> (This means that if there has been an intermediate 'expand', this may
> >>> (eventually) destroy objects at a different memory location from where
> >>> they originally have been created -- but that's fine.)
> >>>
> >>> The 'expand' case itself is different: any (live) entries are *moved*
> >>> into a new storage memory object via placement new.  (And then the
> >>> hollow entries in the old storage memory object linger.)
> >>>
> >>>> Also, shouldn't the destroyed slot be marked either deleted or
> >>>> empty?)
> >>>
> >>> Per my understanding, irrelevant in 'expand': working through 'oentries',
> >>> the *move* is only done 'if (!is_empty (x) && !is_deleted (x))' (so
> >>> combined with the item above, there is no memory leak for any entries
> >>> that have already been 'remove'd -- they have already been destructed),
> >>> and the whole (old) storage memory object will be deallocated right after
> >>> the 'oentries' loop.
> >>>
> >>>
> >>>> (Sorry, I realize I have more questions than answers.)
> >>>
> >>> No worries, happens to me most of the times!  Thanks for looking into
> >>> this.
> >>>
> >>>
> >>> Grüße
> >>>   Thomas
>
>
> -----------------
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955

Re: [PING^2] Re: Fix 'hash_table::expand' to destruct stale Value objects

Reply via email to