Hi All, As I told you in an earlier mail I'm back cleaning up and reworking guile-log and refreshing the memory of the inner details of that code base enabled me to rewrite the spec for redo safe variables considerable. I think that it is much cleaner now and should be worthy of a good discussion.
WDYT? ----------------------------------------------------------------------------------------------------------------------- Dear editors of srfi, SUMMARY: ======== Reinstating continuation in order to capture a state and later restart from the same state is a concept that at least in proof assistants and similar software with user interaction is important. But also direct algorithms in logic programming can take advantage of this concept. Unfortunately scheme is partly broken when trying to implement these constructs effectively with little effort and hence it could be an idea to try improve on the issue. This srfi describes semantics that can help improve the current state without sacrificing code clarity and the elegance of scheme. We will refer to this concept as redo semantics and hence name the new variable types as redo safe variables and a new parameter type redo safe parameters. Also we should distinguish between different kinds of undo Capability, full capability or region guarded capability. The basic idea to conceptually implement this is to keep a list of guarded variables and parameters and then use this to guide storage and retrieval of states. The complexity comes from managing this list and keep it's size small. The main trick to accomplish this is to intelligently add and remove guarded variables depending on where the location of mutation on the stack is. Also one can target the guarding towards a special undo usage pattern and e.g. save only on backtracking over specific regions of the stack. AUTHORS BACKGROUND: =================== I'm an active developer of scheme centered around the guile scheme ecosystem and the developer of guile-log, a logic programming system on top of guile. Guile log implements all of kanren but for efficiency reasons and also semantically reasons are able to use stack based logic programming as well as association list or functional tree versions of assoc for managing variable bindings and implement the proper backtracking. One concept that this enables is guarded variables. A SHORT BACKGROUND ON THE ISSUE: ================================ The background has a plurality of origins that converged into one concept. 1. Current scheme is a beautiful language but has a drawback. If you set! a variable that variable will be boxed and any uses of redo semantic becomes broken. This is not acute in scheme because we try to avoid set! and that is well. But there are cases where you would like to use set!. One example being utilizing a simple generator e.g. (next! n) -> (begin (set! n (+ n 1)) n) This can many times be optimized efficiently and with a good semantic in a looping macro for example but it will not be scheme. To efficiently achieve that without using set! seams to be difficult. Also the resulting loop! macro will not mix well with redo features on top of the system. What one would like is to have unboxed variables if the variables are local only. The problem is that the theoretical motivation behind keeping local variables unboxed is unsound at first sight because you will have different semantics for local variables compared to if you put a non local one in a lambda. Now by viewing these variables as redo safe variables we can treat mutating non local variables as well as non boxed mutating local ones into one semantic meaning which means there is a sound theoretical semantic for this optimized variables e.g. being redo-safe variables. 2. Emacs lisp is coded very much in an imperative style and with guile we would like to sell guile scheme as a back end for emacs lisp and emacs as a whole. One of the selling point to do this could be that we could use continuations to seamlessly add undo redo like semantics to already written programs in emacs lisp. The problem is twofold here. i) Typically emacs variable are dynamical variables. Dynamic variables in scheme is modeled on parameters and they posses the property that for each dynamic state S and parameter p the evaluation of p under S is always the last value setted e.g. it is working like a local variable in the S environment. Because each instance of a continuation get's it's own dynamic state, the possibility of freezing a state that can be restarted multiple times seams to be there assuming there is no other non local exits. The problem is that a dynamic state can be referenced as a scheme object and used as the current state multiple times as well for the same continuation and hence we must overwrite the initial value at a normal return from the dynamic state and hence loose the ability to store and restore naturally. ii) Scheme local variables does also show up in the compilation of emacs lisp to guile. The problem with this is that typically lisp programs in emacs is littered with set!-ing of local variables. And the semantic for this in scheme is to box them and again you loose the ability to redo and undo without a hefty reprogramming or write an advanced compiler to scheme - which is typically not a beautiful solution. Again a good concept for enabling seamless and efficient and well designed code that can be undone and redone multiple times is needed. 3. Guile log is a combination of both traditional stack based logic programming concept and the kanren concept and together with a separate continuation implementation targeted for logical variables. To be able to code all of kanrens concept including a stack as well for speed and extra semantics like easy tracing or more generally to have a dynamic-wind like idioms. The ability to redo and undo is really helpful and enables interesting logic programs to be written. To accomplish this effectively the stack is needed So here it was found out that having variables that dynamically can change between behaving like normal variables or during other dynamical context behave like a state preserving variable yield very satisfying semantics. So guile log uses a separate stack(s) that have quite a lot of optimization's and one of them is efficient redo safe variables. Knowing the usefulness of the concept for advanced logic programming, it seams interesting to add the same concept as a srfi to increase scheme's ability to express similar algorithms. RATIONAL FOR LIFTING THIS TO A SRFI REQUEST: ============================================ The concept have been proven useful and also implementing this in the right way do touch on the fundamentals of scheme in such a way that it, from the perspective of the author, needs help from the greater scheme community in order to improve upon the idea leading to a good specification. Also having it as a srfi will mean that redo semantics will have the potential to be better served in a portable manner. The main drawback refers to making the ability to reason about code incorporating different kinds of variables one way to mitigate this could be to mark variables binding to items with this complexity like we do with mutation. But a general discussion and idea creation how to be able to make code transparent would be welcomed. MAIN CAVEATS: ============= 1. A good implementation of this concepts that are lightweight and will not disturb the speed of scheme programs much, this means that the targeted VM instructions and compilers might need to change. 2. The functionality without a good language support risk of messing up the ability to reason about scheme code. Therefore we want support in scheme to make it possible to introduce this concept in a sane way. SUGGESTED SPECIFICATION: ======================== In order to specify the logic we need: 1. dynamic-wind from r5rs. call/cc from r5rs, and parameters from srfi-39 to mainly to have thread local variables. 2. It is suggested a new variable kind will be called a redo safe variable and a new parameter object called redo safe parameter and the concept will be marked with a ~, like with ! a macro or function that spread this concept should be marked with ~. Else if the semantic follow what is expected by the standard use the usual symbol nomenclature. 3. We will need a concept of a redo safe wind level. It is a bit more involved then just a number. Anyhow with this concept goes a parameter that we denote k for the actual wind level of the guard and K for the wind-level of the reinstating of the continuation. So when we guard a variable we will associate a k with that guard and when reinstating a redo-safe continuation we will need to supply a parameter K that represent the Wind level of the instance. 4. The logic for restoring a state or not in the wind levels k and K are in order, i) if k is a boolean value then that decides the restore or not ii) if K is #t then we will restore iii) if K and k is a number then we will restore a state if K < k iv) if K is a procedure then K should be a predicate and (K k) decides if we should restore or not. v) else do not restore. The rational for i) is that this enables important optimizations that can be done if the compiler knows that k is #t or #f. The rational for ii) is that we can force a re-storage of the state that is convenient in the important case of redo and undo semantics regarding user interaction logic for e.g. proof assistants. The rational for iii) is that in e.g. logic program we might want to keep a variable that behaves just as a variable regarding the logic of the idiom in question e.g. postpone semantics, and that the body of the code in this idem with redo safe data be stored and restored multiple perhaps times. Then an enclosing idem could fire a restore at a higher level hence the name redo safe wind level and at that time the variables that previously behaved as ordinary variables would restore it's state. The rational for iv) is just to make the notion extensible to logic not foreseen by the author. 5. The main concept we will use here is a thread local list, redolist that represents the current list of guards for different variables. The list is a logical set not necessary a scheme list. 6. The variable arguments to the idioms below can be either parameters or scheme variables, we will use the term variables for both these concepts for now in the spec, the reason is that other overhead should be more important then the introduction of a check for parameter or variable in the code, it is indeed more important to be able to batch them together and hence treat variables and parameters with the same constructs. 7. The lowest level code will just take a variable and make a guard object of that variable. In essence we will at this moment fix the k wind level and produce an identity that we can place in different guard operations. So we will have a macro, (make-redo-safe-guard var k) This creates a guard object identity with the frozen wind level k and var identity. To Note here is for parameters the guarded variable the guard should refer to the current active parameter slot e.g. if we later issue a with-parameters ... idiom we will introduce a new slot associated with the parameter and hence to redo guard it we must issue a new redo-safe-guard "dance". The main operations are, 1 (add-variable-guard f) Push guard identity f onto the front of the redo list 2 (remove-variable-guard f) Removes the guard identity f from the redo list preserving the order of the other guards on the list 8. We will need to be able to store a continuation and later restore it and by doing that maybe restore some of the guarded variables. Now we introduce a new call/cc e.g. (redo-safe-call/cc cc) As usually cc should be a lambda e.g. (lambda (cont) ...) The difference from standard call/cc is that 1. at the call of redo-safe-call/cc, the redo-list is stored and is traversed for all guarded variables and the by the guarded variables and parameters referenced value will be stored. 2. cont is now a "lambda" of the form (lambda (K) ...) E.g. when we instantiate it, it has the reinstation wind-level as a parameter. 3. When the cont is instantiated, the standard call/cc reinstation will take place and after that the saved redo-list will be traversed and the stored values will be restored into the different variables. 9. With this we have a minimal working system to implement redo-safeness. But it has a drawback. The redo-list can become really huge, and using the idioms correctly and effectively can be difficult. The opportunity to improve on this is that quite often one have good control on where on the stack layout we do mutation. Let Set! represent an location of the stack where we do mutation and mark with X non mutation regions. Also let Define represent the definition of the variable and we will assume that the variable is never referenced "below" the Define, then a common idiom is a) Define Set! X ... b) Define Set! X ...' Set! X ... Set! '' X ... We will concentrate on redo-safeness for state storage after the Define on the stack. In case the variable get pushed into a lambda and later used although the control has passed below Define in the stack and there gets mutated and stored we refer to the lower level framework. And generally is difficult to automate in a sensible way. For a) we note that we really just need to guard the variable directly after the Set! and put the necessary logic in a dynamic wind there an optimization and hence keep the redo-list small. Not only this, by using this idiom for case b) we can still support a weaker form of redo safeness e.g. for cases when we basically always when we store a state also backtrack back over the guard. With this logic guarded parameters would mean essentially just the ordinary parameters but in some implementations e.g. guile there is a small extra overhead to make the semantic mix with other features e.g. the ability to swap dynamic state. For b) We need between ' and '' to have a guard on the redo-list, but not after '' where we could as in a) in stead again add a guard on the dynamic stack. So in this light the following idioms seams natural: i) guards (redo-safe-variable-guard guard ((v k) ...) code ...) Here v and k will be freezed for guarding and corresponding guard identities will be made the symbol will be used as (guard thunk) In thunk if the guarded variable(s) are not mutated then it is guaranteed that any stored state inside thunk will be correctly restored. If there is no mutation between the storage operation and the control leaving the thunk the semantic would be redo-safe in the strong sense. Else the redo safeness is not specified and implementations as the reference implementation below can make it redo safe, or a more optimized implementation could chose to have the weak form implemented. ii) LR guards, (redo-safe-variable-lr-guard left-guard right-guard ((v k) ...) code ...) Here v and k will be freezed for guarding and corresponding guard identities will be made the two symbols will be used as a guard and used as (left-guard thunk) (right-guard thunk) The semantics is that when control enters the left-guard thunk the guard identities will be added to the redo-list and when it leaves, the guard identity will then be removed from the list. For the right guard the behavior of the redo-list is reversed with the addition of the guard behavior in i) This finishes the specification. HOW DOES THIS MEET THE EXAMPLES THAT WE STARTED FROM? ===================================================== a) For the case with code where set! is needed and the data is mainly local data one would typically use something like (let ((v val) ...) (redo-safe-variable-lr-guard lguard rguard ((v #t) ...) (lguard thunk))) A well designed compiler would then see that the local variables can be put onto the call frame unboxed and the code would be very efficient. If one places the variable in a lambda and it is mutated it will be boxed and an appropriate guard will be added, one also one can optionally expose rguard in such a way that one can prevent an explosion of guards. This concept diverges if we consider state storage after the control leaves the let and v is put in a live mutating lambda. b) For integrating undo/redo semantics in emacs with guile-scheme one could extend elisp with some compiler directives like #parameter-guard-semantic = #t / #f #parameter-lguard? #variable-guard-semantic = #t / #f #variable-lguard? With implementing the same guard as a) for k = #t and skipping it altogether with k = #f. The kind of guard could be controlled by the lguard? versions which meas use lguard (#t) or guard (#f) as a guard. The with e.g. #parameter-guard-semantic #t #parameter-lguard? #f #variable-guard-semantic #t #variable-lguard? #t in the source file we would have efficient lightweight guarding with really low overhead or sometimes even less overhead (local variables oriented code) but still have the weak form of redo safeness. c) To implement the same logic as in guile-log but using delimited continuation and a function stack is with this concept doable (although you will lose quite a lot of optimizations) Anyhow this imply that scheme with this addition is more expressible. As an example in a logic program like in kanren having a idiom that fold each element into a accumulator is difficult to do in a redo safe way. So say these accumulators are used in a setup where you take a breath first search together with subset selection of active continuations which use again variables and then another postpone like idiom is working on-top of this and another one etc. getting this complex creature will be a breeze with redo safe variables and redo safe parameters. REFERENCE IMPLEMENTATION: ========================= Assume that we have syntax-rules macrology system together with a parameters (srfi-39) with the addition of the reference implementation therein of an idiom: (get-parameter-cell parameter) which returns a cons cell with the cdr being the parameters current value and by setting the cdr we will change the parameter value at that frozen state. (define redolist (make-parameter '())) (define (add-variable-guard f) (redolist (cons f (redolist)))) (define (remove-variable-guard f) (redolist (let loop ((l (redolist))) (if (pair? l) (if (eq? (car l) f) (cdr l) (cons (car l) (loop (cdr l)))) '())))) (define (do-restore? k K) (cond ((eq? k #t) #t) ((eq? k #f) #f) ((eq? K #t) #t) ((number? K) (and (number? k) (< K k))) ((procedure? K) (K k)) (else #f))) (define-syntax make-redo-safe-guard (syntax-rules () ((_ v k) (let ((kk k)) (if (parameter? v) (let ((cell (get-parameter-cell v))) (lambda () (let ((V (cdr cell))) (lambda (K) (if (do-restore? k K) (set-cdr! cell V)))))) (lambda () (let ((V v)) (lambda (K) (if (do-restore? k K) (set! v V)))))))))) (define-syntax add-fnames (syntax-rules () ((_ fname ((v k) . l) (m ...) (u ...) th . l) (let ((f (make-redo-safe-guard v k))) (add-fname l (m ... f)) (f u ...) th . l)) ((_ fname () l u th . l) (fname l u th . l)))) (define-syntax redo-safe-variable-lr-guard (syntax-rules () ((_ left right l code ...) (add-fname redo-safe-variable-lr-guard-0 l () () th left right code ...)))) ;; We make a simplistic implementation and only use a strong ;; guard here for the stack. (define-syntax redo-safe-variable-lr-guard-0 (syntax-rules () ((_ (f ...) (f2 ...) thunk lguard rguard code ...) (let ((lguard (lambda (thunk) (dynamic-wind (lambda () (add-variable-guard f2) ...) thunk (lambda () (remove-variable-guard f) ...)))) (rguard (lambda (thunk) (thunk)))) code ...)))) (define-syntax redo-safe-variable-guard-0 (syntax-rules () ((_ (f ...) (f2 ...) thunk guard code ...) (let ((guard (lambda (thunk) (dynamic-wind (lambda () (add-variable-guard f2) ...) thunk (lambda () (remove-variable-guard f) ...))))) code ...)))) (define-syntax redo-safe-variable-rguard (syntax-rules () ((_ g l code ...) (add-fname redo-safe-variable-guard-0 l () () g code ...)))) (define (redo-safe-call/cc cc) (let ((L (map (lambda (f) (f)) (redolist))) (K #f)) (call/cc (lambda (cont) (let ((Cont (lambda (KK) (set! K KK) (cont)))) (cc Cont))) (for-each (lambda (f) (f K)) L))))