On Thu, May 26, 2011 at 10:28 PM, Armando Blancas
<armando_blan...@yahoo.com> wrote:
> Coercing m to int somehow prevents h from being boxed:
>
> user=> (defn bin-search [v k c]
>  (loop [l 0
>         h (dec (count v))]
>    (if (> l h) false
>        (let [m (int (quot (+ l h) 2))
>              m-v (v m)]
>          (cond (> m-v k) (recur (inc m) h)
>                (> k m-v) (recur l (dec m))
>                :else m)))))
> #'user/bin-search
> user=>

I doubt it. Rather, m is now an int so (dec m) can be a valid recur for h.

The real question is why h was primitive to begin with, since there's
no coercion of h to int in either version of the code.

To make this really fast you will want to use int throughout, though,
and unchecked arithmetic:

(defn bin-search-fast [v k]
  (loop [l (int 0)
         h (int (dec (count v)))]
    (if (> l h) false
      (let [m (unchecked-divide (unchecked-add l h) 2)
            m-v (v m)]
        (cond
          (< m-v k) (recur (unchecked-inc m) h)
          (< k m-v) (recur l (unchecked-dec m))
          :else m)))))

This works on ascending values:

=> (bin-search-fast (vec (map #(* % %) (range 1000))) 4096)
64

And despite the proliferation of fast, wrapping unchecked-foo
operations, it is perfectly safe.

The thing that would overflow and wrap first is the sum of l and h.
The worst case occurs when the target's right at the end of the
vector, and l ends up nearly equal to h, which stays one less than the
length of the vector. Then we're adding length - 1 and length - 2 for
2*length - 3. This has to stay below 2147483647, so the maximum vector
length this can support is 2147483650/2 or 1073741825.

Over a billion items; that takes up 4GB just for the object pointers
in the vector. You cannot hit this limit on 32-bit hardware -- OOME
would be thrown first -- and on 64-bit there'd be no penalty from
changing the code to use long instead of int, which makes the maximum
vector size enormously larger than you're likely to use in the
foreseeable future even on 64-bit hardware.

The speed difference is about a factor of two:

=> (defn bin-search-slow [v k]
     (loop [l 0
            h (Integer/valueOf (dec (count v)))]
       (if (> l h) false
         (let [m (quot (+ l h) 2)
               m-v (v m)]
           (cond
             (< m-v k) (recur (inc m) h)
             (< k m-v) (recur l (dec m))
             :else m)))))
#'user/bin-search-slow
=> (def squares (vec (map #(* % %) (range 1000))))
#'user/squares
=> (time (dotimes [_ 1000] (bin-search-slow squares 4096)))
"Elapsed time: 1.37656 msecs"
=> (time (dotimes [_ 1000] (bin-search-fast squares 4096)))
"Elapsed time: 0.7608 msecs"

(Each timing is after several repetitions of the (time ...) line
preceding it, once the numbers have settled down as the JIT has done
its thing, on the hotspot server VM.)

-- 
Protege: What is this seething mass of parentheses?!
Master: Your father's Lisp REPL. This is the language of a true
hacker. Not as clumsy or random as C++; a language for a more
civilized age.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Reply via email to