When reviewing Mikhail's patch, I objected to another local "traits for hashing a string" class. But when I looked into it, I realised it really is quite hard to unify the current hash traits.
At the moment we need separate traits classes for hash_table, hash_set and hash_map. I think that's a sign that we don't have the right abstraction. The aim of this series is to unify the traits for hash_table and hash_set. If that's OK, I'll look at adding default hash_map traits for the common case in which the deleted/empty information is stored in the key rather than the value and where hash_table traits could be used. There were various things I needed to change along the way: - Our one generic hasher, pointer_hash, inherits from typed_noop_remove. In other words, it's specifically for cases where the pointer isn't freed when an entry is deleted. That seems a bit arbitrary. I think we should separate out the hashing and the removing aspects. Also, typed_noop_remove and typed_free_remove don't provide hashing typedefs, but the ggc equivalents ggc_hasher and ggc_cache_hasher do. If we separate the hashing and removing aspects, the typedefs belong to the hashing side. This gives: "Hashers" - the only generic one is pointer_hash, but I'll follow up with a string one if the series is OK "Removers" - typed_noop_remove (as it exists now) - typed_free_remove (as it exists now) - ggc_remove (renamed from ggc_hasher, but without the typedefs) - ggc_cache_remove (renamed from ggc_cache_hasher, but without the typedefs) These are then combined to give: - nofree_ptr_hash - free_ptr_hash - ggc_ptr_hash - ggc_cache_ptr_hash For this particular case, it might be cleaner to use a vec<>-style parameter for the allocator (really just freer), e.g. hash_table<foo, da_gc>, hash_table<foo, da_heap> I did try that, but it felt unnatural in the cases where the remove() function frees some subobject as well. Either you have a separate "allocator" that frees both the subdata and the entry itself, or you have a separate (hasher?) function for freeing subdata. Much of the awkwardness here is due to not having "proper" object construction and destruction. If we had that, the memory management would be an integral part of the element type rather than being part of the traits. So I don't think adding a template parameter for the element remover would necessarily be forward progress. Although we end up with four classes for pointers, and could have at least three for strings, most other types have only one. - The traits have optional is_deleted, is_empty, mark_deleted and mark_empty methods, with a default behaviour suitable for pointer elements. The methods were probably optional because many pointer traits classes were implemented independently, but after the change above, they can instead inherit from pointer_hash (or from an allocator-specific derivation). It then makes sense to put the handling of empty and deleted pointer entries in pointer_hash and remove the complicated code for handling defaults. - handle_cache_entry is too tied to the internals. The generic implementation is: static void handle_cache_entry (T &e) { if (e != HTAB_EMPTY_ENTRY && e != HTAB_DELETED_ENTRY && !ggc_marked_p (e)) e = static_cast<T> (HTAB_DELETED_ENTRY); } and a typical override looks like: void pad_type_hasher::handle_cache_entry (pad_type_hash *&t) { extern void gt_ggc_mx (pad_type_hash *&); if (t == HTAB_EMPTY_ENTRY || t == HTAB_DELETED_ENTRY) return; else if (ggc_marked_p (t->type)) gt_ggc_mx (t); else t = static_cast<pad_type_hash *> (HTAB_DELETED_ENTRY); } In particular, this bypasses the normal remove() function, so we never update m_n_deleted. ISTM the (single) caller should be checking for empty and deleted items first and handling any deletion itself. All we want to know from the traits is which of the following applies: - the entry should be deleted - the entry should be kept and needs to be marked - the entry should be kept and is already marked (an optimisation of the previous case). The patch adds a keep_cache_entry function to provide this information. Series bootstrapped & regression-tested on x86_64-linux-gnu. Also tested with config-list.mk. Thanks, Richard