On Fri, 11 Apr 2025, Vivien Kraus <viv...@planete-kraus.eu> wrote: > Hello, > > Le vendredi 11 avril 2025 à 09:49 -0400, Olivier Dion a écrit : >> But some fields seems to >> not be taken into account and you will end up with fetching >> substitute >> in the store. I want to avoid this element of surprise in my case.
Side note, my problem is outside of Guix entirely. I just gave this as an example of caching behavior I want to avoid in my project. > From what I understand, here (problem A), you want to avoid a hash > collision (two different codes hashing to the same value)… Right. We don't want `(lambda (x y) (+ x y))' to have the same hash as `(lambda (x y) (* x y))'. Otherwise, users would get a big suprise to see that it get the same cached result. >> Like mentioned in the comment (look for the `(procedure? value)' >> clause), this does _not_ yield the same hash for a procedure if the >> module where this procedure came from is compiled. > > And here (problem B), you want to avoid having equivalent code (source > vs bytecode) hashing to different values. > > Did I summarize it correctly? I think the definition of equivalent here is that the pure behavior of the procedures are the same. Say we're hashing `(lambda (x y) (+ x y))' and `(lambda (u v) (+ u v))'. These procedures does not have equivalent sources, but I do have equivalent semantics. This is why the bytecode is hashed instead, because it represents the semantic of the procedure at the machine level. Alas, If you rehash the same procedure after compilation (often auto-compilation), you get a different hash of course. > Solving A is easy: generate a new unique identifier each time you hash > any functional value (or refuse to cache the result). This assumes the same hashing ordering from process to process. Hash must be identical for every invokation of Guile. But really A is not a problem here I think. > I don’t have an answer for B, but others may. I do have an idea. I'm willing to have `define#' and `lambda#' syntax-transformers. Users would use it to define procedures that are important to hash. The idea would be to evaluate the source of the procedure at expansion time and hash this procedure bytecode. The resulting hash is thus computed at expansion time and stored as a procedure property. Here's the implementation: --8<---------------cut here---------------start------------->8--- (use-modules (ice-9 match) (rnrs bytevectors) ((srfi srfi-69) #:prefix srfi:) (system foreign) (system vm program) ) (eval-when (expand load compile) (define (procedure-bytecode proc) (match (program-address-range proc) ((start . stop) (pointer->bytevector (make-pointer start) (- stop start))))) (define (hash-bytevector bv) (let loop ((acc (srfi:hash #vu8())) (k 0)) (if (= k (bytevector-length bv)) acc (loop (+ acc (* (1+ k) (srfi:hash (bytevector-u8-ref bv k)))) (1+ k))))) (define (hash-expression formals body body*) (false-if-exception (hash-bytevector (procedure-bytecode (eval `(lambda (,@formals) ,body ,@body*) (current-module))))))) (define-syntax define# (lambda (stx) (syntax-case stx () ((_ (name formals ...) body body* ...) (identifier? #'name) (let ((the-hash (hash-expression (syntax->datum #'(formals ...)) (syntax->datum #'body) (syntax->datum #'(body* ...))))) (with-syntax ((set-hash! (if the-hash #`(set-procedure-property! name 'hash #,the-hash) #'(begin)))) #'(begin (define* (name formals ...) body body* ...) set-hash!))))))) (define-syntax lambda# (lambda (stx) (syntax-case stx () ((_ (formals ...) body body* ...) (let ((the-hash (hash-expression (syntax->datum #'(formals ...)) (syntax->datum #'body) (syntax->datum #'(body* ...))))) #`(let ((proc (lambda (formals ...) body body* ...))) #,(if the-hash #`(set-procedure-property! proc 'hash #,the-hash) #'(begin)) proc)))))) (define (procedure-hash proc) (or (procedure-property proc 'hash) 0)) (define# (foo x y) (+ x y)) (define# (bar x y) (+ x y)) (pk (procedure-hash foo) (procedure-hash bar) (procedure-hash (lambda# (u v) (+ u v))) ) --8<---------------cut here---------------end--------------->8--- > However, considering problem B, it may be more complex than just > source and bytecode. What if the GOOPS object creator uses different > dependencies, but they turn out to have the same behavior? I’m > thinking about a non-free/free replacement situation. Since the object > creator has a clearer vision of the functions and dependencies they > use, maybe you could ask them to provide a version number of their > relevant functional code, and hash this version number instead? I'm not sure to understand your concern here about non-free replacements. Could you expand on this by giving an example? Thanks, Olivier -- Olivier Dion oldiob.ca