If you really care about performance, I would use a macro for code
generation and do something like the following:
(defmacro appendfunc
  ([m keys sb]
   `(do
      (.append ~sb (~m ~(first keys)))
      ~@(for [k (next keys)]
          `(do
             (.append ~sb \,)
             (.append ~sb (~m ~k))))
      ;; \newline is the newline character; \n would append the letter n
      (.append ~sb \newline))))
(let [sb (StringBuilder.)
      f (fn [^StringBuilder sb m]
          (appendfunc m [:one :two :three :four] sb))
      maps (repeat 250000 {:one 1 :two 2 :three 3 :four 4})]
  (time (let [res (str (reduce f sb maps))] (count res))))
=> "Elapsed time: 118.105355 msecs"
Could probably optimise a bit more, but that's under 500ns per row (118ms
over 250,000 rows is roughly 470ns each)... pretty decent I think.
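
To make the code-generation point concrete, here is a small sketch of what the
macro call amounts to once expanded. It assumes the macro above with \newline
as the row terminator; append-row-by-hand, append-row-by-macro and
hand-vs-macro are hypothetical names used only for this illustration. The
hand-written function is what I would expect the expansion to look like for
three keys, and the last form simply checks that both produce the same text.

```clojure
;; Sketch only: `appendfunc` is the macro from this post; every other name
;; below is hypothetical and exists just for the comparison.
(defmacro appendfunc [m keys sb]
  `(do
     (.append ~sb (~m ~(first keys)))
     ~@(for [k (next keys)]
         `(do
            (.append ~sb \,)
            (.append ~sb (~m ~k))))
     (.append ~sb \newline)))

;; Hand-written equivalent of (appendfunc m [:one :two :three] sb):
;; straight-line appends, with no loop over the keys at run time.
(defn append-row-by-hand [^StringBuilder sb m]
  (do
    (.append sb (m :one))
    (do (.append sb \,) (.append sb (m :two)))
    (do (.append sb \,) (.append sb (m :three)))
    (.append sb \newline)))

;; The same row appender, generated by the macro at compile time.
(defn append-row-by-macro [^StringBuilder sb m]
  (appendfunc m [:one :two :three] sb))

;; Build a tiny CSV both ways so the two results can be compared.
(def hand-vs-macro
  (let [rows [{:one 1 :two 2 :three 3} {:one 4 :two 5 :three 6}]]
    [(str (reduce append-row-by-hand (StringBuilder.) rows))
     (str (reduce append-row-by-macro (StringBuilder.) rows))]))
```

The key list has to be a literal vector at the call site, since the macro
walks it at compile time. If the columns are only known at run time, a plain
function looping over the keys would be needed instead, at some cost in speed.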
On Thursday, 12 February 2015 09:25:12 UTC+8, Mark Watson wrote:
>
> I'm looking for the most performant way to transform a huge seq (size
> 250000) of maps into a single CSV.
>
> The data structure looks something like:
>
> (def data-struct
>   (repeat 250000 {:one 1 :two 2 :three 3 :four 4}))
>
> A naive implementation would be:
>
> (let [f #(->> % (map (comp str val)) (clojure.string/join ","))]
>   (->> data-struct
>        (map f)
>        (clojure.string/join "\n")))
>
> However, this takes far too long for my application (on the order of 10s
> of seconds).
>
> Another attempt using reducers:
>
> (require '[clojure.core.reducers :as r])
>
> (let [f #(->> % (map (comp str val)) (clojure.string/join ","))
>       r-join (fn
>                ([] nil)
>                ([x y]
>                 (if (and x y) (str x "\n" y)
>                   (if x (str x)
>                     (if y (str y))))))]
>   (->> data-struct
>        (r/map f)
>        (r/fold r-join)))
>
> Still not great.
>
> But, looking at the sources of clojure.string/join and clojure.core/str,
> it becomes apparent that both implementations create an instance of
> java.lang.StringBuilder for each element in the sequence. (I have to
> imagine this is the main issue, even though GC seems to only be ~5% of
> the runtime)
>
> Would it make sense to instantiate one java.lang.StringBuilder for all of
> the concatenation (and call java.lang.StringBuilder append)?
>
> What's the best way to do this with idiomatic Clojure?
>
> Thanks a lot!
>
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en