On Dec 2, 2:50 pm, ataggart <alex.tagg...@gmail.com> wrote:

> After reading the code, I'm inclined to not trust those numbers.  Note
> that the time metrics for test-split* are all in the same ballpark,
> creating the same number of superfluous, intermediate String
> instances, but the memory numbers you list are wildly different.  How
> are you collecting these numbers?  Have you controlled for the GC
> kicking in?

I'm not doing anything clever, just watching the RES column in top. If
you have any good suggestions, I'm open to them.
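
One possible alternative to watching RES, as a minimal sketch: RES
shows memory the JVM has obtained from the OS, which can include heap
that is committed but holds only garbage, so asking the JVM itself may
say more about what is actually in use. The class and method names
below are hypothetical. Plain Java is used because the calls are
JVM-level and reachable from Clojure through interop. Note that
System.gc() is only a request and the JVM may ignore it.

```java
public class HeapProbe {
    /** Bytes of heap currently occupied by objects, collected or not. */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Requests a collection (a hint only), then reports used heap. */
    public static long usedHeapAfterGc() {
        System.gc();
        return usedHeapBytes();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Used heap after a requested GC approximates live data.
        System.out.printf("used after GC:  %d MB%n", usedHeapAfterGc() >> 20);
        // Committed heap is what the JVM currently holds from the OS.
        System.out.printf("heap committed: %d MB%n", rt.totalMemory() >> 20);
        // The limit is the ceiling the heap may grow to (-Xmx).
        System.out.printf("heap limit:     %d MB%n", rt.maxMemory() >> 20);
    }
}
```

Calling usedHeapAfterGc() before and after each test run would show
whether the growth seen in top is retained data or just uncollected
garbage in a heap the JVM was allowed to expand.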

There is a qualitative difference between the runs, though. I can run
test-split-3 five times in a row, all with similar times, without
having the java process size get bigger than 0.6 GB. When I run any of
the others, the size quickly balloons up to something more like 8.5
GB.

As far as speed goes, the two faster functions (clustering around
170 +/- 10 s on re-runs) are the ones that deal well with memory,
while the slower ones (clustering around 190 +/- 10 s) use more
memory.

Upon re-testing, test-split-4 also made the process size increase,
but only once. That I don't understand. The others regularly balloon
up to 8.5 GB.

In any case, I would expect that if I were doing this right, the
process size would stay small, since only a single line has to be held
in memory at one time. Sure, I'm consing a lot of garbage, but it's
all short-lived. And there's still the question of how to do this in
parallel.
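
To make the "one line at a time" expectation concrete, here is a
minimal sketch in plain Java. It is not the test-split code from this
thread. The class name, the method name, and the tab delimiter are all
assumptions for illustration. It reads a line, counts its fields by
scanning characters instead of allocating substrings, and lets the
line become garbage before the next one is read.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class LineStream {
    /**
     * Counts delimiter-separated fields over all lines of the source.
     * Only the current line is referenced at any time. An empty line
     * counts as one (empty) field.
     */
    public static long countFields(Reader source, char delimiter) {
        long fields = 0;
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                fields++; // every line has at least one field
                for (int i = 0; i < line.length(); i++) {
                    if (line.charAt(i) == delimiter) {
                        fields++; // each delimiter starts another field
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Two lines with 3 and 2 tab-separated fields: prints 5.
        System.out.println(countFields(new StringReader("a\tb\tc\nd\te\n"), '\t'));
    }
}
```

With a loop of this shape the live data is one line plus a counter, so
any sustained growth would have to come from something holding on to
earlier lines, or from the JVM simply not collecting yet.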

Thanks,
Johann
