On Fri, 6 Aug 2021, 17:58 Thomas Schwinge, <tho...@codesourcery.com> wrote:
> Hi! > > So I'm trying to do some C++... ;-) > > Given: > > /* A map from SSA names or var decls to record fields. */ > typedef hash_map<tree, tree> field_map_t; > > /* For each propagation record type, this is a map from SSA names or > var decls > to propagate, to the field in the record type that should be used > for > transmission and reception. */ > typedef hash_map<tree, field_map_t> record_field_map_t; > > Thus, that's a 'hash_map<tree, hash_map<tree, tree>>'. (I may do that, > right?) Looking through GCC implementation files, very most of all uses > of 'hash_map' boil down to pointer key ('tree', for example) and > pointer/integer value. > > Then: > > record_field_map_t field_map ([...]); // see below > for ([...]) > { > tree record_type = [...]; > [...] > bool existed; > field_map_t &fields > = field_map.get_or_insert (record_type, &existed); > gcc_checking_assert (!existed); > [...] > for ([...]) > fields.put ([...], [...]); > [...] > } > [stuff that looks up elements from 'field_map'] > field_map.empty (); > > This generally works. > > If I instantiate 'record_field_map_t field_map (40);', Valgrind is happy. > If however I instantiate 'record_field_map_t field_map (13);' (where '13' > would be the default for 'hash_map'), Valgrind complains: > > 2,080 bytes in 10 blocks are definitely lost in loss record 828 of 876 > at 0x483DD99: calloc (vg_replace_malloc.c:762) > by 0x175F010: xcalloc (xmalloc.c:162) > by 0xAF4A2C: hash_table<hash_map<tree_node*, tree_node*, > simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> > >::hash_entry, false, xcallocator>::hash_table(unsigned long, bool, bool, > bool, mem_alloc_origin) (hash-table.h:275) > by 0x15E0120: hash_map<tree_node*, tree_node*, > simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> > >::hash_map(unsigned long, bool, bool, bool) (hash-map.h:143) > by 0x15DEE87: hash_map<tree_node*, hash_map<tree_node*, tree_node*, > simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> >, > simple_hashmap_traits<default_hash_traits<tree_node*>, hash_map<tree_node*, > tree_node*, simple_hashmap_traits<default_hash_traits<tree_node*>, > tree_node*> > > >::get_or_insert(tree_node* const&, bool*) (hash-map.h:205) > by 0x15DD52C: execute_omp_oacc_neuter_broadcast() > (omp-oacc-neuter-broadcast.cc:1371) > [...] > > (That's with '#pragma GCC optimize "O0"' at the top of the 'gcc/*.cc' > file.) > > My suspicion was that it is due to the 'field_map' getting resized as it > incrementally grows (and '40' being big enough for that to never happen), > and somehow the non-POD (?) value objects not being properly handled > during that. Working my way a bit through 'gcc/hash-map.*' and > 'gcc/hash-table.*' (but not claiming that I understand all that, off > hand), it seems as if my theory is right: I'm able to plug this memory > leak as follows: > > --- gcc/hash-table.h > +++ gcc/hash-table.h > @@ -820,6 +820,8 @@ hash_table<Descriptor, Lazy, Allocator>::expand () > { > value_type *q = find_empty_slot_for_expand > (Descriptor::hash (x)); > new ((void*) q) value_type (std::move (x)); > + //BAD Descriptor::remove (x); // (doesn't make sense and) a ton > of "Invalid read [...] inside a block of size [...] free'd" > + x.~value_type (); //GOOD This seems to work! -- but does it > make sense? > } > > p++; > > However, that doesn't exactly look like a correct fix, does it? I'd > expect such a manual destructor call in combination with placement new > (that is being used here, obviously) -- but this is after 'std::move'? > However, this also survives a smoke-test-like run of parts of the GCC > testsuite, bootstrap and complete run now ongoing. > Does GCC's hash_map assume you only use it to store POD (plain old data) types, which don't need to be destroyed, because they don't have any dynamically allocated memory or other resources? A hash_map is not a POD, because it does have dynamically allocated memory. If my guess is right, then hash_map should really use a static_assert to enforce that requirement, instead of letting you use it in a way that will leak.