On Sat, Nov 6, 2010 at 11:28 AM, thattommyhall <thattommyh...@gmail.com> wrote:
> Hey,
>
> I read http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf a while ago
> and noticed you had referenced the paper and implemented the naive
> algorithm in contrib and thought I could chip in and do the sieve for
> you but get an error in my current implementation with only a million
> primes.
> "Exception in thread "main" java.lang.OutOfMemoryError: Java heap
> space (core.clj:1)"
>
> The code is at 
> https://github.com/thattommyhall/primes/blob/master/src/primes/core.clj
> , its only my second week in clojure so I'm not sure im using lazy-seq
> correctly, if sorted-map is the correct datastructure to use or if I'm
> missing something else obvious.

What you're missing that's arguably obvious is that there's a lot of
storage space overhead for Clojure's sorted-maps and other data
structures. Here's a Clojure sieve that works fine up to at least 50
million and is reasonably fast (though the same algorithm implemented
in Java is 10x faster even with the type hints and primitive
arithmetic in the Clojure; not sure why).

(defn primes-to [n]
  (let [#^booleans sieve (boolean-array (inc n) true)
        n (int n)]
    (loop [p (int 2)]
      (let [pp (unchecked-multiply p p)]
        (if (> pp n)
          (filter identity (map #(if %1 %2) (drop 2 sieve) (iterate inc 2)))
          (do
            (loop [i pp]
              (aset sieve i false)
              (let [j (unchecked-add i p)]
                (if (<= j n)
                  (recur j))))
            (let [q (int
                      (loop [p (unchecked-inc p)]
                        (if (aget sieve p)
                          p
                          (recur (unchecked-inc p)))))]
              (recur q))))))))

Here's a much more Clojury version. It's much slower and runs out of
memory past about 20 million, on a VM with 1GB -Xmx. That still sounds
like it beats your less than 1 million:

(defn primes-to [n]
  (loop [sieve (vec (take (quot (dec n) 2) (iterate #(+ % 2) 3))) i 0]
    (let [p (first (drop i (filter identity sieve)))
          indices (take-while
                    #(<= % (dec (quot n 2)))
                    (iterate #(+ % p) (dec (quot (* p p) 2))))]
      (if (empty? indices)
        (cons 2 (filter identity sieve))
        (recur
          (persistent!
            (reduce
              (fn [s index]
                (assoc! s index nil))
              (transient sieve)
              indices))
          (quot p 2))))))

The logic's fairly convoluted; I squeezed some performance and memory
out of it by storing only odd numbers in the vector (hence all the
(quot foo 2)s and the (cons 2 foo) on the line that generates the
return value); some more by using a transient vector for the inner
loop of the sieve. However it is functional instead of stateful,
unlike the version with the explicit mutable boolean array above.

There's no sorted map involved in either -- except to the extent that
a vector can be regarded as a sorted map with non-negative-integer
keys. Neither implementation explicitly uses the lazy-seq macro,
either.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Reply via email to