Re: organizing multiple dosyncs and ref-set vs. commute

Rich Hickey Mon, 25 Aug 2008 11:57:08 -0700

On Aug 19, 11:06 pm, Stuart Halloway <[EMAIL PROTECTED]>
wrote:
> Hi all,
>
> In order to deeply appreciate how Clojure removes the pain of lock-
> based concurrency, I am re-implementing the example code from Java
> Concurrency in Practice [http://www.javaconcurrencyinpractice.com/] in
> Clojure. Consider Listing 2.8, which demonstrates creating fine-
> grained locks around a simple cached calculation:
>
> // elided for brevity
> public class CachedFactorizer extends GenericServlet implements
> Servlet {
>      @GuardedBy("this") private BigInteger lastNumber;
>      @GuardedBy("this") private BigInteger[] lastFactors;
>      @GuardedBy("this") private long hits;
>      @GuardedBy("this") private long cacheHits;
>
>      public void service(ServletRequest req, ServletResponse resp) {
>          BigInteger i = extractFromRequest(req);
>          BigInteger[] factors = null;
>          synchronized (this) {
>              ++hits;
>              if (i.equals(lastNumber)) {
>                  ++cacheHits;
>                  factors = lastFactors.clone();
>              }
>          }
>          if (factors == null) {
>              factors = factor(i);
>              synchronized (this) {
>                  lastNumber = i;
>                  lastFactors = factors.clone();
>              }
>          }
>          encodeIntoResponse(resp, factors);
>      }
>
> }
>
> And in Clojure (stripping out the Servlet noise):
>
> (def *last-number* (ref nil))
> (def *last-factors* (ref nil))
> (def *count* (ref 0))
> (def *cache-hits* (ref 0))
>
> (defn cached-factor [n]
>    (dosync (commute *count* inc))
>    (or (dosync (if (= n @*last-number*)
>                 (do (commute *cache-hits* inc) @*last-factors*)))
>        (let [factors (factor n)]
>         (dosync
>          (do (ref-set *last-number* n)
>              (ref-set *last-factors* factors))))))
>
> A few questions/comments:
>
> (1) Are the ref-sets overly strict? Could commute be used instead? I
> am happy with last-in wins, so long as *last-number* and *last-
> factors* change in tandem.
> (2) In the Java example there are separate synchronized blocks to
> increase possible concurrency. I have tried to write the Clojure
> transactions similarly, in particular making sure the "expensive"
> calculation (factor) is outside a transaction. How can it be done
> better?
> (3) Bonus: How idiomatic is my Clojure?
>
> Thanks! I am truly enjoying Clojure.
>

This is the most efficient all-ref approach, from a contention
perspective:

(def hits (ref 0))
(def cache (ref {:n nil :factors nil}))
(def cache-hits (ref 0))

(defn cached-factor [n]
  (dosync (commute hits inc))
  (let [cached @cache]
    (if (= n (cached :n))
      (dosync (commute cache-hits inc)
        (cached :factors))
      (let [factors (factor n)]
        (dosync
         (commute cache (fn [_] {:n n :factors factors})))
        factors))))

It leverages a few aspects of Clojure's refs:

'Wild' reads are supported, i.e. you can read a ref outside a
transaction as long as you don't need to see a coordinated view with
other refs.

So, in this case, it is better to use a composite object for the
cache, so that a wild read will return a coordinated n and its
factors. N.B. that the cache is read only once, in the let. Sprinkling
multiple wild @ reads around won't work since each deref could return
a different value.

The ref-sets were needed in your example, but with a composite cache,
you can use the commute trick above (commute with a fn that ignores
its arg) to get last-one-in-wins, since this is just a last-value-only
cache.

As written above, none of the transaction in cached-factor will ever
retry, and the cache check is non-transactional - cache readers will
not wait for each other (as they will with the synchronized blocks in
the original example).

Your code is fine, and what's nice is it's easy to write multiple
correct versions, we're just optimizing here. I'm not sure I like the
Lisp *blah* convention for Clojure if the var is not going to be used
for dynamic rebinding.

Note that Clojure's immutable data structures work well with
java.util.concurrent, and that for this simple last-value cache, this
is the fastest possible way:

(def #^AtomicLong hits (AtomicLong. 0))
(def #^AtomicReference cache (AtomicReference. {:n nil :factors nil}))
(def #^AtomicLong cache-hits (AtomicLong. 0))

(defn cached-factor [n]
  (.incrementAndGet hits)
  (let [cached (.get cache)]
    (if (= n (cached :n))
      (do (.incrementAndGet cache-hits)
        (cached :factors))
      (let [factors (factor n)]
        (.set cache {:n n :factors factors})
        factors))))

A Java programmer might not think at first of using an immutable
composite for the cache, but a Clojure programmer (eventually) should,
at which point you can go for the lock-free version above. Note how it
has the same structure as the transactional Clojure version, and the
same caveat on reading the cache only once.

Rich
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~----------~----~----~----~------~----~------~--~---
Re: organizing multiple dosyncs and ref-set vs. commute

Reply via email to