I need to process large binary files, specifically to remove ^M (carriage return, byte 13) characters. Assume the files are about 50MB: small enough to be processed in memory, but not with a naive implementation.
The following code works, except that it throws OutOfMemoryError for files as small as 6MB:

    (defn read-bin-file [file]
      (to-byte-array (as-file file)))

    (defn remove-cr-from-file [file]
      (let [dirty-bytes (read-bin-file file)
            clean-bytes (filter #(not (= 13 %)) dirty-bytes)
            changed?    (< (count clean-bytes) (alength dirty-bytes))] ; OutOfMemoryError
        (if changed?
          (write-bin-file file clean-bytes) ; writing works fine
          nil)))

How can I force 'filter' to be efficient, i.e. to create another array instead of a memory-blowing list? And how should I approach processing large binary data in Clojure in general?
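For illustration, here is a minimal sketch of one way to avoid the intermediate list: copy the bytes to keep into a second primitive array with loop/recur, then trim it. The likely cause of the blow-up is that 'filter' returns a lazy sequence with one heap-allocated cell per element, and 'count' realizes all of it while the head is still referenced. The function name remove-cr is hypothetical and not part of the code above; the sketch assumes the whole file already fits in memory as a byte array, and a stream-based version would be needed for files that do not.

```clojure
(defn remove-cr
  "Returns a new byte array holding the bytes of `dirty` with every
  byte equal to 13 (carriage return) dropped."
  ^bytes [^bytes dirty]
  (let [n          (alength dirty)
        ^bytes out (byte-array n)]          ; worst case: nothing is removed
    (loop [i 0, j 0]
      (if (< i n)
        (let [b (aget dirty i)]
          (if (== b 13)
            (recur (inc i) j)               ; skip the CR byte
            (do (aset out (int j) b)        ; keep any other byte
                (recur (inc i) (inc j)))))
        ;; j is the number of bytes kept; trim the output to that length
        (java.util.Arrays/copyOf out (int j))))))
```

With this, changed? could be computed as (< (alength clean-bytes) (alength dirty-bytes)), comparing two array lengths without realizing any sequence. Peak memory is roughly twice the file size plus the trimmed copy.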