I'm afraid I can't reproduce this error, Alexander.  I can run

    (write-lines "/tmp/out" (line-seq (reader "/tmp/bigfile")))

on a 4.5 GB file with no problem, and I don't have that much memory.

Out-of-memory errors like this usually occur when your code is
"holding on to the head" of the sequence.  For example, this will
fail:

    (def lines (line-seq (reader "/tmp/bigfile")))
    (write-lines "/tmp/out" lines)

because the "lines" var holds a reference to the head of the
sequence, so every line that has already been read stays reachable
and the entire file ends up cached in memory.
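
As a rough sketch of the safe pattern (not the only way to do it),
you can consume the sequence inside the same expression that creates
it, so nothing else ever points at its head.  This version assumes
clojure.java.io rather than duck-streams, and "copy-lines" and its
argument names are made up for illustration:

```clojure
(require '[clojure.java.io :as io])

(defn copy-lines
  "Copies in-path to out-path one line at a time, applying f to each
  line. Only doseq ever sees the line seq, so lines that have been
  written can be garbage collected."
  [in-path out-path f]
  (with-open [rdr (io/reader in-path)
              ^java.io.Writer wtr (io/writer out-path)]
    (doseq [line (line-seq rdr)]
      (.write wtr (str (f line)))
      ;; Always terminate with "\n", whatever the input used.
      (.write wtr "\n"))))
```

Calling (copy-lines "/tmp/bigfile" "/tmp/out" identity) should behave
like the one-liner above, and you can pass your own line-processing
function in place of identity.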

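Another possibility is that your big file doesn't have any line
breaks, or that it has extremely long lines.  Your stack trace fails
inside BufferedReader.readLine while it is growing a string buffer,
which would be consistent with that.  In that case, you'll have to
increase the Java heap size or manually read the file in smaller
chunks.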
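
Reading in chunks could look something like the sketch below.  It
assumes clojure.java.io, and "copy-in-chunks" and "chunk-size" are
names I made up.  It copies through a fixed-size char buffer, so
memory use doesn't depend on how long the lines are; any per-line
processing would need to be reworked to operate on chunks instead.

```clojure
(require '[clojure.java.io :as io])

(defn copy-in-chunks
  "Copies in-path to out-path through a buffer of chunk-size chars,
  without ever looking for line breaks."
  [in-path out-path chunk-size]
  (with-open [^java.io.Reader rdr (io/reader in-path)
              ^java.io.Writer wtr (io/writer out-path)]
    (let [^chars buf (char-array chunk-size)]
      (loop []
        ;; .read returns the number of chars read, or -1 at end of file.
        (let [n (.read rdr buf)]
          (when (pos? n)
            (.write wtr buf 0 n)
            (recur)))))))
```

For example, (copy-in-chunks "/tmp/bigfile" "/tmp/out" 8192) copies
the file 8192 chars at a time.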

-Stuart Sierra


On Jul 24, 10:28 am, Alexander Stoddard <alexander.stodd...@gmail.com>
wrote:
> I am a very new clojure user but I believe I have found a bug when
> using the clojure.contrib.duck-streams library.
>
> My attempt to stream process a very big file blows up with
> "java.lang.OutOfMemoryError: Java heap space".
>
> I can reproduce the problem with the following simple code which I
> think rules out most of my own (nearly unlimited) ignorance.
>
> (use '[clojure.contrib.duck-streams :only(reader write-lines)])
> (write-lines "test.out" (line-seq (reader "ReallyBigFile")))
>
> Can anyone enlighten me as to what might be going wrong and/or
> suggest an alternative?
>
> My original code looked like:
> (write-lines "test.out" (map my-line-processing-function (line-seq
> (reader "ReallyBigFile"))))
>
> Thank you and kind regards,
> Alex Stoddard
>
> Further details below:
>
> I am using clojure and clojure contrib built from the head of the git
> repository:
> richhickey-clojure-3e60eff602652e753a54ba88b25dbdd2615c3b2e
> richhickey-clojure-contrib-e20e8effe977640592b1f285d6c666492d74df00
>
> My java details are:
> java version "1.6.0_04"
> Java(TM) SE Runtime Environment (build 1.6.0_04-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 10.0-b19, mixed mode)
>
> Stack trace:
>
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> (test_read_write.clj:0)
>         at clojure.lang.Compiler.eval(Compiler.java:4617)
>         at clojure.lang.Compiler.load(Compiler.java:4931)
>         at clojure.lang.Compiler.loadFile(Compiler.java:4898)
>         at clojure.main$load_script__6637.invoke(main.clj:210)
>         at clojure.main$init_opt__6640.invoke(main.clj:215)
>         at clojure.main$initialize__6650.invoke(main.clj:243)
>         at clojure.main$null_opt__6672.invoke(main.clj:268)
>         at clojure.main$legacy_script__6687.invoke(main.clj:299)
>         at clojure.lang.Var.invoke(Var.java:359)
>         at clojure.main.legacy_script(main.java:32)
>         at clojure.lang.Script.main(Script.java:20)
> Caused by: java.lang.OutOfMemoryError: Java heap space
>         at java.util.Arrays.copyOf(Arrays.java:2882)
>         at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
>         at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:515)
>         at java.lang.StringBuffer.append(StringBuffer.java:306)
>         at java.io.BufferedReader.readLine(BufferedReader.java:345)
>         at java.io.BufferedReader.readLine(BufferedReader.java:362)
>         at clojure.core$line_seq__4708$fn__4710.invoke(core.clj:1790)
>         at clojure.lang.LazySeq.sval(LazySeq.java:42)
>         at clojure.lang.LazySeq.seq(LazySeq.java:56)
>         at clojure.lang.LazySeq.first(LazySeq.java:78)
>         at clojure.lang.RT.first(RT.java:549)
>         at clojure.core$first__3817.invoke(core.clj:43)
>         at clojure.contrib.duck_streams$write_lines__117.invoke(duck_streams.clj:221)
>         at user$eval__298.invoke(test_read_write.clj:3)
>         at clojure.lang.Compiler.eval(Compiler.java:4601)
>         ... 10 more