On 7/10/23 8:44 AM, H. S. Teoh wrote:
On Mon, Jul 10, 2023 at 09:30:57AM +0000, IchorDev via Digitalmars-d-learn 
wrote:
[...]
 From the spec it sounds as though (but good luck testing for sure)
that if you have (for example) 6 big dummy key-value pairs in the AA
to begin with, then if you use `.clear` it "Removes all remaining keys
and values from [the] associative array. The array is not rehashed
after removal, __to allow for the existing storage to be reused.__"
[...]

This is not an accurate understanding of what actually happens.  The AA
implementation consists of a primary hashtable (an array), each slot of
which points to a list of buckets. Clearing the AA does not discard the
hashtable, but does dispose of the buckets, so adding new keys
afterwards will allocate new buckets.  So the buckets used by the dummy
key-value pairs do not get reused without a reallocation.

This used to be the case, but in the latest implementation, the AA uses
open addressing. That is, if a slot is full, it linearly searches the table for the next open slot. This cuts down significantly on cache misses, since the hash slot itself stores the hash, so the pointer only needs to be followed if the hash matches.

Fun historical fact, the original AA used a tree to store colliding elements, which means that all AA keys had to support both opEquals and opCmp.

BUT, you are right that the elements themselves are individual blocks of memory, and only the bucket array is reused.

If you want to see how the hash table works, without having to deal with all the typeinfo acrobatics that the actual builtin AA uses, you can look at my newaa library:

https://github.com/schveiguy/newaa

This is the exact implementation of the AA internals, but uses compile-time introspection instead of typeinfo.

-Steve

Reply via email to