RFC: laziness-safe, semi-dynamic environment Var(Lite)

jon Tue, 29 Sep 2009 13:04:51 -0700

Hi.. long post.. it's a Request For Comment :)

Clojure's "thread local binding facility" for Vars has always seemed
like a useful (but of course misusable) feature to have available in
our toolbox. However, it soon becomes apparent that Var bindings don't
play nicely with laziness. Since laziness can creep in all over the
place (e.g. via standard sequence functions, direct use of
(lazy-seq ..), (delay ..), or perhaps any (fn ..) you create), that
renders them of much less value: almost too unsafe to tangle with, in
my opinion.
To quote Rich, "there is a fundamental tension between laziness and
dynamic scope, combine with extreme caution."


The underlying problem is that when each (fn ..) gets created it
doesn't capture the current dynamic environment so that it can
subsequently be made available when it is eventually invoked.
To quote Rich once more, "The overhead for capturing the dynamic
context for every lazy seq op would be extreme, and would effectively
render dynamics non-dynamic."
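
To make the failure mode concrete outside Clojure, here is a minimal
Java sketch. It is not Clojure's implementation: DynVar, binding and
lazyAdd are hypothetical names, and a ThreadLocal stack merely stands
in for a Var's thread-local bindings. A closure created inside the
binding but run after the binding has been popped sees the root value.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

public class LostBindingDemo {
    /** Hypothetical stand-in for a Var: a root value plus a per-thread stack of bindings. */
    static final class DynVar<T> {
        private final T root;
        private final ThreadLocal<Deque<T>> stack = ThreadLocal.withInitial(ArrayDeque::new);

        DynVar(T root) { this.root = root; }

        /** Innermost binding on this thread, or the root value if none. */
        T get() {
            Deque<T> s = stack.get();
            return s.isEmpty() ? root : s.peek();
        }

        /** Runs body with value bound, then pops the binding, like (binding [...] ...). */
        <R> R binding(T value, Supplier<R> body) {
            stack.get().push(value);
            try {
                return body.get();
            } finally {
                stack.get().pop();
            }
        }
    }

    static final DynVar<Integer> ADDVAL = new DynVar<>(1);

    /** Returns an unevaluated computation, the way a lazy seq holds an unrealized fn. */
    static Supplier<Integer> lazyAdd(int x) {
        return () -> x + ADDVAL.get();
    }

    public static void main(String[] args) {
        // Realized inside the binding: the bound value is visible.
        int eager = ADDVAL.binding(10, () -> lazyAdd(1).get());
        // Created inside the binding, realized after it has been popped.
        Supplier<Integer> deferred = ADDVAL.binding(10, () -> lazyAdd(1));
        System.out.println(eager);          // 11
        System.out.println(deferred.get()); // 2
    }
}
```

The deferred case is what happens with a lazy seq: the closure is
built while the binding is live but only runs once it is gone.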

To help alleviate the problem somewhat, a (bound-fn ..) helper macro
has been created (https://www.assembla.com/spaces/clojure/tickets/170)
but my guess is that its use would be impractical/ugly/risky..
it would need to be used "all over the place" and forgetting to use it
in any of those places could introduce a bug.

I've been thinking about an alternative to (bound-fn ..) and would
like your opinions on the following tweak to Clojure:

* Designate one "Var" (say clojure.core/*env*) as a special
  "environment" Var. It would be bound either to nil or to something
  non-nil (normally a map).
* Modify Clojure's implementation behind (fn ..) to do a lightweight
  version of what (bound-fn ..) does, i.e. on instantiation, capture
  the current value of *env*, and when invoked, wrap the execution in
  a bind/unbind of *env* with the captured value, but only if non-nil.
* Create a (with-env {...} ...) helper macro.
* Developers just need to make sure not to wrap (with-env ..) around
  code that /loads/ their software (only around code that /runs/ their
  software).

As a proof-of-concept, I implemented this in the simplest, most
hackish way, but it seems to work quite well. The main details follow:

-In RT.java
  add a public static 'ENV' field (similar to IN, OUT, etc) associated
  with clojure.core/*env*, with a root binding of nil

-In both RestFn.java and AFn.java:
  add a new private 'env' field.
  set 'env' to the deref of RT.ENV in the constructor.
  rename *all* invoke() methods to invoke0().
  add corresponding new invoke() methods for each invoke0() in order
  to intercept execution.
  Example:
  -public Object invoke(Object arg1, Object arg2, Object arg3) throws Exception{
  -  return throwArity();
  -}
  ---
  +public Object invoke0(Object arg1, Object arg2, Object arg3) throws Exception{
  +  return throwArity();
  +}
  +public Object invoke(Object arg1, Object arg2, Object arg3) throws Exception{
  +  try {
  +    if (env != null)
  +      <...something to push 'env' value onto RT.ENV...>;
  +    return invoke0(arg1,arg2,arg3);
  +  }
  +  finally {
  +    if (env != null)
  +      <...something to pop 'env' value off RT.ENV...>;
  +  }
  +}

-In Compiler.java:
  make the following change so that (fn ..) objects override
  invoke0() instead of invoke().
  -    Method m = new Method(isVariadic() ? "doInvoke" : "invoke",
  +    Method m = new Method(isVariadic() ? "doInvoke" : "invoke0",
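
The changes above can be sketched as a small standalone Java program.
This is only an illustration under my own assumptions, not the actual
patch: Env, Fn, withEnv and addFromEnv are hypothetical names, a
ThreadLocal stack stands in for RT.ENV, and a plain Integer stands in
for the map that *env* would normally hold.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

public class EnvCaptureSketch {
    /** Hypothetical stand-in for RT.ENV: a per-thread stack, null when nothing is bound. */
    static final class Env {
        private static final ThreadLocal<Deque<Object>> STACK =
                ThreadLocal.withInitial(ArrayDeque::new);

        static Object current() { return STACK.get().peek(); } // null if the stack is empty
        static void push(Object v) { STACK.get().push(v); }
        static void pop() { STACK.get().pop(); }

        /** Analogue of (with-env {...} ...). */
        static <R> R withEnv(Object value, Supplier<R> body) {
            push(value);
            try { return body.get(); } finally { pop(); }
        }
    }

    /** Analogue of the patched AFn: captures the env at construction time. */
    abstract static class Fn {
        private final Object env = Env.current();

        /** Subclasses implement this, as compiled fns would implement invoke0(). */
        protected abstract Object invoke0(Object arg);

        /** Re-installs the captured env around the call, but only if it was non-null. */
        public final Object invoke(Object arg) {
            if (env == null) return invoke0(arg);
            Env.push(env);
            try { return invoke0(arg); } finally { Env.pop(); }
        }
    }

    /** Builds a fn that adds the current env value (or 1 if no env is bound). */
    static Fn addFromEnv() {
        return new Fn() {
            @Override protected Object invoke0(Object arg) {
                Object e = Env.current();
                int addval = (e == null) ? 1 : (Integer) e;
                return ((Integer) arg) + addval;
            }
        };
    }

    public static void main(String[] args) {
        Fn plain = addFromEnv();                                     // no env bound
        Fn captured = Env.withEnv(10, EnvCaptureSketch::addFromEnv); // created under env
        System.out.println(plain.invoke(1));    // 2
        System.out.println(captured.invoke(1)); // 11, although withEnv has already exited
    }
}
```

One choice in this sketch: the push happens before entering try, so
the pop in finally only runs when the push actually took place.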

-------Example of it working-------
 user=> (def *other* {:addval 1})
 #'user/*other*
 user=> (map #(+ % (:addval *other*)) [1 3 5 7 9])
 (2 4 6 8 10)      ;<---AS EXPECTED.
 user=> (binding [*other* {:addval 10}]
                (map #(+ % (:addval *other*)) [1 3 5 7 9]))
 (2 4 6 8 10)      ;<---OOPS. BINDING DISAPPEARED.
 user=> *env*
 nil               ;<---DEFAULTS TO nil
 user=> (with-env {:addval 10}
                (map #(+ % (:addval *env*)) [1 3 5 7 9]))
 (11 13 15 17 19)  ;<---GREAT. BINDING WAS REMEMBERED.
-------------------------------------

Now what about the overhead? Based on a little initial testing...
when using a regular Var to implement our special *env*,
when *env* is not utilized (i.e. left bound to nil) the overhead
appears to be negligible, but when bound it is quite significant..
Consuming this 30-million entry lazy list:
  (time (last (map identity (range 30000000))))
with *env* unbound = ~18 sec
with *env* bound   = ~65 sec

However, if we choose to create clojure.core/*env* not referring
to a Var but something else (unfortunately Clojure is extremely
inextensible in this regard), we can instead invent and use
a "lighter weight Var" because something simple is adequate.
I experimented by creating a VarLite class.
It extends Var (had to change Var to be non-final) and manages
the pushing/popping of its value with a simple stack (just for
itself) with a bit of caching, and dispenses with Validators, etc.
This reduces the overhead dramatically:
  (time (last (map identity (range 30000000))))
with *env* unbound = ~18 sec
with *env* bound   = ~22 sec
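
For readers wondering what such a "lighter weight Var" might look
like, here is a rough Java sketch. It is not my VarLite class: LiteVar
is a hypothetical name, it does not extend Var, and it leaves out the
caching. It only shows the core idea of a single per-thread stack with
no validators.

```java
import java.util.ArrayList;

public class LiteVarSketch {
    /** Hypothetical lightweight dynamic var: one per-thread stack, no validators. */
    static final class LiteVar {
        private final ThreadLocal<ArrayList<Object>> stack =
                ThreadLocal.withInitial(ArrayList::new);

        /** Binds a new value on the calling thread. */
        void push(Object v) { stack.get().add(v); }

        /** Removes the innermost binding on the calling thread. */
        void pop() {
            ArrayList<Object> s = stack.get();
            s.remove(s.size() - 1);
        }

        /** Top of this thread's stack, or null when nothing is bound. */
        Object deref() {
            ArrayList<Object> s = stack.get();
            return s.isEmpty() ? null : s.get(s.size() - 1);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LiteVar env = new LiteVar();
        env.push("outer");
        env.push("inner");
        System.out.println(env.deref()); // inner
        env.pop();
        System.out.println(env.deref()); // outer

        // Bindings are per thread: a new thread starts with nothing bound.
        Object[] seen = new Object[1];
        Thread t = new Thread(() -> seen[0] = env.deref());
        t.start();
        t.join();
        System.out.println(seen[0]); // null
        env.pop();
    }
}
```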

With a better integrated, better designed implementation, I'm
certain this could be improved further.
In that case, would this be a worthwhile enhancement to Clojure?
Seems like it could be a win-win situation, since it rescues
(semi-)dynamic bindings from the gnashing jaws of laziness
for those that want to use it, but shouldn't impact negatively
upon those that don't?
Or is there something fundamentally wrong with the idea?

Thanks for reading,
Jon
