Hi!
So I'm trying to do some C++... ;-)
Given:
/* A map from SSA names or var decls to record fields. */
typedef hash_map<tree, tree> field_map_t;
/* For each propagation record type, this is a map from SSA names or var
decls
to propagate, to the field in the record type that should be used for
transmission and reception. */
typedef hash_map<tree, field_map_t> record_field_map_t;
Thus, that's a 'hash_map<tree, hash_map<tree, tree>>'. (I may do that,
right?) Looking through GCC implementation files, very most of all uses
of 'hash_map' boil down to pointer key ('tree', for example) and
pointer/integer value.
Then:
record_field_map_t field_map ([...]); // see below
for ([...])
{
tree record_type = [...];
[...]
bool existed;
field_map_t &fields
= field_map.get_or_insert (record_type, &existed);
gcc_checking_assert (!existed);
[...]
for ([...])
fields.put ([...], [...]);
[...]
}
[stuff that looks up elements from 'field_map']
field_map.empty ();
This generally works.
If I instantiate 'record_field_map_t field_map (40);', Valgrind is happy.
If however I instantiate 'record_field_map_t field_map (13);' (where '13'
would be the default for 'hash_map'), Valgrind complains:
2,080 bytes in 10 blocks are definitely lost in loss record 828 of 876
at 0x483DD99: calloc (vg_replace_malloc.c:762)
by 0x175F010: xcalloc (xmalloc.c:162)
by 0xAF4A2C: hash_table<hash_map<tree_node*, tree_node*,
simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*>
>::hash_entry, false, xcallocator>::hash_table(unsigned long, bool, bool, bool,
mem_alloc_origin) (hash-table.h:275)
by 0x15E0120: hash_map<tree_node*, tree_node*,
simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*>
>::hash_map(unsigned long, bool, bool, bool) (hash-map.h:143)
by 0x15DEE87: hash_map<tree_node*, hash_map<tree_node*, tree_node*,
simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> >,
simple_hashmap_traits<default_hash_traits<tree_node*>, hash_map<tree_node*,
tree_node*, simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*>
> > >::get_or_insert(tree_node* const&, bool*) (hash-map.h:205)
by 0x15DD52C: execute_omp_oacc_neuter_broadcast()
(omp-oacc-neuter-broadcast.cc:1371)
[...]
(That's with '#pragma GCC optimize "O0"' at the top of the 'gcc/*.cc'
file.)
My suspicion was that it is due to the 'field_map' getting resized as it
incrementally grows (and '40' being big enough for that to never happen),
and somehow the non-POD (?) value objects not being properly handled
during that. Working my way a bit through 'gcc/hash-map.*' and
'gcc/hash-table.*' (but not claiming that I understand all that, off
hand), it seems as if my theory is right: I'm able to plug this memory
leak as follows:
--- gcc/hash-table.h
+++ gcc/hash-table.h
@@ -820,6 +820,8 @@ hash_table<Descriptor, Lazy, Allocator>::expand ()
{
value_type *q = find_empty_slot_for_expand (Descriptor::hash
(x));
new ((void*) q) value_type (std::move (x));
+ //BAD Descriptor::remove (x); // (doesn't make sense and) a ton of
"Invalid read [...] inside a block of size [...] free'd"
+ x.~value_type (); //GOOD This seems to work! -- but does it make
sense?
}
p++;
However, that doesn't exactly look like a correct fix, does it? I'd
expect such a manual destructor call in combination with placement new
(that is being used here, obviously) -- but this is after 'std::move'?
However, this also survives a smoke-test-like run of parts of the GCC
testsuite, bootstrap and complete run now ongoing.
Grüße
Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht
München, HRB 106955