Hi Guile users!

I have a hashing procedure that can hash many things like so:

  (define (hash value)
    (cond
     ((pair? value)
      (let loop ((elements value)
                 (acc (srfi:hash '()))
                 (k 1))
        (if (null? elements)
            acc
            (loop (cdr elements)
                  (+ acc (* k (hash (car elements))))
                  (1+ k)))))
     ((vector? value)
      (let loop ((acc (srfi:hash (vector)))
                 (k 0))
        (if (= k (vector-length value))
            acc
            (loop (+ acc (* (1+ k) (hash (vector-ref value k))))
                  (1+ k)))))
     ((bytevector? value)
      (let loop ((acc (srfi:hash #vu8()))
                 (k 0))
        (if (= k (bytevector-length value))
            acc
            (loop (+ acc (* (1+ k) (hash (bytevector-u8-ref value k))))
                  (1+ k)))))
     ;; FIXME:  The hashing of a procedure is not the same after compilation.
     ;; ((procedure? value)
     ;;  (match (program-address-range value)
     ;;    ((start . stop)
     ;;     ;; Adding the hash of 'procedure so that it does not give the same
     ;;     ;; hash for an identical bytevector.
     ;;     (hash (+ (srfi:hash 'procedure)
     ;;              (hash (pointer->bytevector (make-pointer start)
     ;;                                         (- stop start))))))))
     ((or (number? value)
          (string? value)
          (symbol? value)
          (boolean? value))
      (srfi:hash value))
     ((instance? value)
      (let ((class (class-of value)))
        (fold
         (lambda (slot acc)
           (+ (if (hash-slot? slot)
                  (let ((name (slot-definition-name slot)))
                    (* (hash name) (hash (slot-ref value name))))
                  0)
              acc))
         (hash (class-name class))
         (class-slots class))))
     (else
      0)))
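
For anyone who wants to try the snippet, here is roughly the preamble it
relies on.  This is a sketch, not my exact module header: the `srfi:'
prefix on SRFI-69 is how `srfi:hash' gets its name here, and
`hash-slot?' below is only a placeholder that accepts every slot (the
real predicate is not shown in this mail).

  (use-modules ((srfi srfi-69) #:prefix srfi:) ; srfi:hash
               (srfi srfi-1)                   ; fold
               (rnrs bytevectors)              ; bytevector?, bytevector-u8-ref
               (oop goops))                    ; instance?, class-of, class-slots

  ;; Placeholder (assumption): every slot takes part in the hash.
  (define (hash-slot? slot)
    #t)

The commented-out procedure clause would additionally need
(ice-9 match), (system foreign) and (system vm program).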

As mentioned in the comment (look for the `(procedure? value)'
clause), this does _not_ yield the same hash for a procedure once the
module it came from is compiled.  And it's not possible to pass the
procedure (a reference) to `compile' to get a compiled version.  I
assume the compiler would need the lexical context for a couple of its
optimizations anyway, so even if it did work, we would not get the same
bytecode.

Note that hashing the source code won't work either because:

  1. If the procedure is not compiled, no source location is available,
  so this is not usable in a REPL.

  2. Hashing the source does not make sense from my point of view.
  Adding a newline to a function does not change its bytecode, so it
  should yield the same hash.

So I wonder if somebody has an idea on how to get a reliable hash of a
procedure.  That is, a given procedure should yield the same hash in
different Guile processes, compiled or not.

My use case is the following: I have a GOOPS object that is used to
produce a pure result, which I store in a cache on disk, keyed by the
hash of the object.  Another Guile process can then fetch the result
back from disk if the hashes match.  As you can see in the above code,
GOOPS instances get hashed by folding over their slots, which can
include procedures.
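
To make the caching side concrete, here is a minimal sketch of what I
mean.  The names `%cache-directory', `cache-file' and `cached' are made
up for this mail, and it assumes the result is a plain datum that
`write' and `read' can round-trip.

  ;; Hypothetical cache location; any writable directory would do.
  (define %cache-directory
    (string-append (or (getenv "TMPDIR") "/tmp") "/hash-cache-sketch"))

  (define (cache-file key)
    ;; KEY is the integer returned by `hash'.
    (string-append %cache-directory "/" (number->string key 16)))

  (define (cached key thunk)
    "Return the result cached under KEY, or call THUNK and cache its value."
    (let ((file (cache-file key)))
      (if (file-exists? file)
          (call-with-input-file file read)
          (let ((result (thunk)))
            (unless (file-exists? %cache-directory)
              (mkdir %cache-directory))
            (call-with-output-file file
              (lambda (port) (write result port)))
            result))))

A call would look like (cached (hash object) (lambda () (compute
object))), where `compute' stands for whatever pure computation the
object describes.  This only works across processes if `hash' returns
the same integer in each of them, which is exactly what breaks for
procedures.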

If you have ever used Guix, you know that changing the value of a
package field and asking to build the package usually triggers a
rebuild.  But some fields seem not to be taken into account, and you
end up fetching a substitute instead.  I want to avoid this element of
surprise in my case.

Thanks,
Olivier
 
-- 
Olivier Dion
oldiob.ca