I would like to confirm that adding "-XX:+UseSerialGC" cured all the
symptoms. In fact, the Clojure version with loop/recur now runs faster than
the Java one. I will update the GitHub repo with all findings later.
Again, thanks for all the pointers and hints. I feel enriched now.
Best regards,
AndyL
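
To double-check that a flag such as "-XX:+UseSerialGC" actually took effect, the running JVM can be asked which collectors it has registered. The following is only a minimal sketch using the standard java.lang.management API; the class name GcReport is made up for illustration and is not part of the parallel-gzip repo.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the original thread or repository.
class GcReport {
    // Returns one line per registered collector: name, number of
    // collections so far, and accumulated collection time in milliseconds.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

Running it once with and once without the flag should show different collector names. The exact names depend on the JVM vendor and version, so they are best compared rather than hard-coded.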
On Wed, Sep 16,
Thanks for the confirmation. I ran "jstat" on both cases and it indicates a
lot of GC activity in the Survivor0 space, specifically for
"LazySeq+Gzip+2Threads". This is a new area of learning for me, though. That
would explain the spikes and JVM pauses.
Best regards,
Andy
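
A rough in-process complement to "jstat" is to read the heap memory pools through the standard MemoryPoolMXBean API. This is only a sketch: the class name HeapPools is invented for illustration, pool names differ between collectors, and unlike jstat this view does not split the survivor area into S0 and S1.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not from the original thread or repository.
class HeapPools {
    // Maps each memory pool name (eden, survivor, old generation, ...)
    // to the number of bytes currently used in that pool.
    static Map<String, Long> usedBytesByPool() {
        Map<String, Long> used = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage != null) {
                used.put(pool.getName(), usage.getUsed());
            }
        }
        return used;
    }

    public static void main(String[] args) {
        usedBytesByPool().forEach((name, bytes) ->
                System.out.println(name + ": " + bytes + " bytes used"));
    }
}
```

Sampling this periodically from a background thread while the test runs would give a coarse picture of how quickly the young-generation pools fill up.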
On Wed, Sep 16, 2015 at 9:41 AM, Gerrit Jansen van
I agree with you that the LazySeq+Gzip+2Threads combination causing a spike
is weird.
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
Gerrit,
Yes, I do compare apples to oranges, but my orange looks like a nice
Macintosh apple when run in a single thread :-). What I did not expect is
the lack of symmetry in the multi-threaded case, and the weird JVM CPU
spikes.
WRT encoding - thanks for the hint - I removed the ASCII one from Jav
One more thing, although it's unrelated to the performance differences seen:
the character encoding specified in the Java code is US-ASCII, while the
Clojure reader uses UTF-8. Byte-to-character decoding can make a huge
difference in text-processing apps; see
http://java-performance.info/charset-encodi
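
To illustrate the encoding point, here is a small sketch that decodes the same bytes with both charsets. The class name CharsetDemo and the sample string are invented for illustration. It shows only that the two decoders can produce different text, not how their speed compares.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not from the original thread or repository.
class CharsetDemo {
    // Decodes raw bytes into a String using the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "cafe" with an accented final letter: 4 characters, 5 bytes in UTF-8.
        byte[] utf8Bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        String asUtf8 = decode(utf8Bytes, StandardCharsets.UTF_8);
        String asAscii = decode(utf8Bytes, StandardCharsets.US_ASCII);

        // The two non-ASCII bytes cannot be mapped by US-ASCII and are each
        // replaced with U+FFFD, so the decoded strings differ in length.
        System.out.println("UTF-8 length:    " + asUtf8.length());
        System.out.println("US-ASCII length: " + asAscii.length());
    }
}
```

For a like-for-like benchmark, both the Java and the Clojure reader would need to be given the same charset explicitly.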
Hi,
I do not think it has anything to do with thread sync or jit+gzip as a
matter of fact.
Why threads aren't the issue:
I've downloaded the code on my machine and the Clojure code always runs
slower, no matter whether I read one or two files, with gzip or without.
You run the test case using (future) an
Hi,
Thanks for looking into my questions. I posted a self-contained example
here https://github.com/coreasync/parallel-gzip with instructions on how to
create test data as well. Also attached below are the results I get on my
quite decent hardware (partial 'time' results are mangled, was not sure how to
sepa
I had the same question - are you running independent thread-isolated
lazy-seqs on different sources in different threads? Or are you creating
one lazy-seq and then *using* it to do different things in multiple threads?
In the first case, the synchronization in lazy-seq only happens in a
thread
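
For comparison, the "independent, thread-isolated" case asked about above can be sketched on the Java side as follows: every task opens its own gzip stream and reader, so nothing is shared between threads. This is an assumption-laden illustration, not the code from the parallel-gzip repo. The class ParallelGzipCount and the helper writeSample are invented here, and line counting stands in for whatever per-line work the real test does.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch, not taken from the original repository.
class ParallelGzipCount {
    // Counts lines in one gzip file. The stream and reader are created
    // inside the call, so each thread works on its own isolated source.
    static long countLines(Path gz) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(gz)), StandardCharsets.UTF_8))) {
            long n = 0;
            while (reader.readLine() != null) {
                n++;
            }
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs one counting task per file, each on its own pool thread,
    // and returns the counts in the same order as the input files.
    static List<Long> countAll(List<Path> files) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, files.size()));
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (Path file : files) {
                Callable<Long> task = () -> countLines(file);
                futures.add(pool.submit(task));
            }
            List<Long> counts = new ArrayList<>();
            for (Future<Long> future : futures) {
                counts.add(future.get());
            }
            return counts;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    // Writes a small gzip file with the given number of lines (test data only).
    static Path writeSample(int lines) {
        try {
            Path path = Files.createTempFile("sample", ".gz");
            try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                    new GZIPOutputStream(Files.newOutputStream(path)), StandardCharsets.UTF_8))) {
                for (int i = 0; i < lines; i++) {
                    writer.write("line " + i);
                    writer.newLine();
                }
            }
            return path;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<Path> files = new ArrayList<>();
        for (String arg : args) {
            files.add(Paths.get(arg));
        }
        if (files.isEmpty()) { // no arguments: fall back to generated samples
            files.add(writeSample(1000));
            files.add(writeSample(2000));
        }
        System.out.println(countAll(files));
    }
}
```

Because no state is shared between the tasks, any slowdown seen with two threads in a setup like this would more likely point at allocation and GC pressure or I/O than at lock contention.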
Do you have a corresponding example of the parallel code? I'm not sure
which part(s) are being delegated to other threads.
Often it is just the I/O cost of reading the file that is the dominant
cost, so parallelism doesn't buy you much.
Alan
On Mon, Sep 14, 2015 at 9:10 PM, Andy L wrote:
> Hi