Hi Simon -

Thanks for the detailed reply, my responses are in-line below.

~Thomas

On Wed, Oct 30, 2019 at 7:52 AM Simon Schlee <schleesi...@gmail.com> wrote:

> Racket has mutable and immutable hash tables these are implemented
> differently.
> Because as a user of hash tables you have to differentiate between mutable
> and immutable,
>
I dug into the source a little bit here: for posterity, the immutable hash
"tables" are actually HAMTs.
There's no particular reason why the mutable hash tables couldn't also be
HAMTs, but the fact that they're not suggests a good reason at least for
why there wouldn't be an efficient direct conversion from mutable maps and
sets to immutable maps and sets.


> Immutable hash tables are excellent for your use case, because you can
> easily keep around the original version just by keeping a
> variable/reference to it.
> As its name implies it is immutable/not changed, so when you "update" the
> hash that you are using as a set you aren't really updating it, instead you
> create a new updated hash from it, while the original version remains as it
> was. Because the immutable hash is implemented as a persistent
> data-structure with structural sharing, this is efficient,
>
I'm reasonably familiar
<https://repository.library.brown.edu/studio/item/bdr:918868/> with the
semantics of persistent data structures =)


> because it doesn't copy everything but instead only copies a few small
> internal nodes, those which need to be changed, the rest refers to the
> old/original version.
> Depending on the racket version there are different internal
> implementations of immutable hashes,
> but you can expect operations like hash-set to create a small extended
> version of your hash (if you are using racket's immutable hashes) that
> doesn't copy the complete hash.
>
In my use case, persistence will result in *O(N log N)* unnecessary
allocations, since the root-to-leaf path will be rewritten *N* times as I
drain the set.
Threading the state in and out of the function calls implementing my loop
will also make my code a lot more verbose for no particular gain, as
opposed to just wrapping a mutable set in a closure.


> Immutable hash tables actually provide O(log N) access and update. Since N
>> is limited by the address space so that log N is limited to less than 30
>> or 62 (depending on the platform), log N can be treated reasonably as a
>> constant.
>>
> from https://docs.racket-lang.org/reference/hashtables.html
>
This is actually six years out of date and should be updated: a HAMT is
typically bounded to depth 6 or 7.


> Another possibility is to create a custom-set implementation which wraps
> the hash to appear like a set, here is an example implementation:
>
> #lang racket
>
> (provide (contract-out
>           [proxy-set (-> (and/c hash? immutable?) generic-set?)])
>          proxy-set->set
>          proxy-set->seteq
>          proxy-set->seteqv)
>
> (require racket/struct)
>
> (struct proxy-set [hash]
>   ;; #:transparent
>   #:methods gen:set
>   [(define (set-member? s k)
>      (hash-has-key? (proxy-set-hash s) k))
>    (define (set-add s v)
>      (proxy-set (hash-set (proxy-set-hash s) v #f)))
>    (define (set-remove s v)
>      (proxy-set (hash-remove (proxy-set-hash s) v)))
>    (define (set-empty? s)
>      (hash-empty? (proxy-set-hash s)))
>    (define (set-count s)
>      (hash-count (proxy-set-hash s)))
>    (define (set->stream s)
>      (sequence->stream (in-immutable-hash-keys (proxy-set-hash s))))]
>
>   #:methods gen:custom-write
>   [(define write-proc
>      (make-constructor-style-printer
>       (lambda (s) 'proxy-set)
>       (lambda (s) (in-immutable-hash-keys (proxy-set-hash s)))))])
>
> ;; these functions are in case you eventually don't need the proxy anymore
> ;; and want to convert to a plain racket set
> (define/contract (proxy-set->set s)
>   (-> proxy-set? (and/c generic-set? set-equal? set?))
>   (for/set ([k (in-immutable-hash-keys (proxy-set-hash s))])
>     k))
>
> (define/contract (proxy-set->seteq s)
>   (-> proxy-set? (and/c generic-set? set-eq? set?))
>   (for/seteq ([k (in-immutable-hash-keys (proxy-set-hash s))])
>     k))
>
> (define/contract (proxy-set->seteqv s)
>   (-> proxy-set? (and/c generic-set? set-eqv? set?))
>   (for/seteqv ([k (in-immutable-hash-keys (proxy-set-hash s))])
>     k))
>
> (define-syntax-rule (show x)
>   (displayln (format "~a: ~a" (quote x) x)))
>
> (define (example)
>   (define original-map (hasheq 'a 4 'b 7 'c 1))
>   (show original-map)
>
>   (define extended-map (hash-set original-map 'another 42))
>   (show extended-map)
>
>   (define derived-set (proxy-set original-map))
>   (show derived-set)
>
>   (define extended-set (set-union derived-set (seteq 'd 'e)))
>   (show extended-set)
>
>   (displayln "after all that the original map remains unchanged")
>   (show original-map)
>   (displayln "and the extended set does not modify the derived set")
>   (show derived-set)
>
>   (displayln "you also can merge the new keys from the extended-map")
>   (displayln "with the extended-set at a later point")
>   (define merged-set (set-union extended-set (proxy-set extended-map)))
>   (show merged-set)
>
>   (displayln "or convert the proxy-set to a plain racket set")
>   (define plain-set (proxy-set->seteq merged-set))
>   (show plain-set))
>
> (module+ main
>   (example))
>
> ;; output
> ;; original-map: #hasheq((a . 4) (b . 7) (c . 1))
> ;; extended-map: #hasheq((a . 4) (another . 42) (b . 7) (c . 1))
> ;; derived-set: #<proxy-set: a c b>
> ;; extended-set: #<proxy-set: a e d c b>
> ;; after all that the original map remains unchanged
> ;; original-map: #hasheq((a . 4) (b . 7) (c . 1))
> ;; and the extended set does not modify the derived set
> ;; derived-set: #<proxy-set: a c b>
> ;; you also can merge the new keys from the extended-map
> ;; with the extended-set at a later point
> ;; merged-set: #<proxy-set: a e another d c b>
> ;; or convert the proxy-set to a plain racket set
> ;; plain-set: #<seteq: a e another d c b>
>
This is a nice feature of the Racket libraries, thanks for illustrating it.

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/racket-users/CA%2BHWosXtZBmGj9Y1nDd8Xwoac9h%3Df57%3DKAb0QSEeyHXzDj5f_w%40mail.gmail.com.

Reply via email to