On Dec 2, 2:50 pm, ataggart <alex.tagg...@gmail.com> wrote:
> After reading the code, I'm inclined to not trust those numbers. Note
> that the time metrics for test-split* are all in the same ballpark,
> creating the same number of superfluous, intermediate String
> instances, but the memory numbers you list are wildly different. How
> are you collecting these numbers? Have you controlled for the GC
> kicking in?

I'm not doing anything clever, just watching the RES column on top. If you have any good suggestions, I'm open to them.

There is a qualitative difference between the runs, though. I can run test-split-3 five times in a row, all with similar times, without having the java process size get bigger than 0.6 GB. When I run any of the others, the size quickly balloons up to something more like 8.5 GB.

As far as speed goes, the two faster functions (clustering upon re-runs around 170 +/- 10 s) are the ones that deal well with memory, while the slower ones (clustering around 190 +/- 10 s) use more memory.

Upon re-testing, I had test-split-4 make the process size increase once. That I don't understand. But the others regularly balloon up to 8.5 GB.

In any case, I would expect that if I were doing this right, the process size would stay small, since only a single line has to be held in memory at one time. Sure, I'm consing a lot of garbage, but it's all short-lived.

And there's still the question of how to do this in parallel.

Thanks,
Johann
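
On the question of how the numbers are collected: the RES column reports how much memory the operating system has given the process, and a JVM does not necessarily hand heap back once it has grown, so RES may stay large even when very little data is live. A rough sketch of one way to separate the two, assuming it is acceptable to request a collection before each reading, is to ask the JVM itself through java.lang.management. The class name HeapProbe and the stand-in workload countFields are made up for illustration; they are not the test-split functions from this thread.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of the code discussed in this thread.
class HeapProbe {

    // Requests a collection, then returns {used, committed} heap bytes
    // taken from a single snapshot. MemoryMXBean.gc() is effectively
    // System.gc(), which is only a request to the JVM.
    static long[] snapshotAfterGc() {
        ManagementFactory.getMemoryMXBean().gc();
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return new long[] { heap.getUsed(), heap.getCommitted() };
    }

    // Stand-in workload: splits many short tab-separated lines, creating
    // lots of short-lived String garbage, and returns the field count.
    static long countFields(int lines) {
        long total = 0;
        for (int i = 0; i < lines; i++) {
            String line = "a\tb\tc\t" + i;
            total += line.split("\t").length;
        }
        return total;
    }

    public static void main(String[] args) {
        long[] before = snapshotAfterGc();
        long fields = countFields(1_000_000);
        long[] after = snapshotAfterGc();
        System.out.println("fields counted:         " + fields);
        System.out.println("used before (bytes):    " + before[0]);
        System.out.println("used after (bytes):     " + after[0]);
        System.out.println("committed after (bytes): " + after[1]);
    }
}
```

If the used figure after a collection stays small while the committed figure (and RES) is large, the growth is mostly garbage the collector had not yet needed to reclaim rather than data being retained. The same calls are reachable from a Clojure REPL through Java interop, and capping the heap with -Xmx is another way to check whether a run really needs the extra memory.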