We use Clojure on Hadoop through the Cascading framework. It would be hard
to isolate Clojure's influence on performance because our code is
complicated, but Clojure is used mostly to specify the Flow (a DAG
construct) that Cascading provides. That's a way to use Clojure that
doesn't have to put Clojure in the data path. Cascalog takes the
abstraction a bit further.

Profiling would show you where to get the best bang for your performance
buck. Also, check for use of reflection; that's usually a quick win. There
might be something you can do to 20% of the code that will provide an 80%
speedup.
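
As a rough sketch of the reflection check (the function names below are
made up for illustration, and I'm assuming the mapper does something like
the .contains test from the quoted message), turning on
*warn-on-reflection* makes the compiler report interop calls it cannot
resolve at compile time, and a type hint usually removes them:

```clojure
;; Ask the compiler to print a warning for every interop call it cannot
;; resolve statically and must therefore dispatch via reflection.
(set! *warn-on-reflection* true)

;; Hypothetical helper, not taken from the gist. With no type hint the
;; compiler does not know the class of `line`, so this .contains call is
;; looked up reflectively at run time (and a warning is printed).
(defn slow-match? [line needle]
  (.contains line needle))

;; Same logic with ^String hints. The compiler can now emit a direct
;; call to String.contains, and no reflection warning is printed.
(defn fast-match? [^String line ^String needle]
  (.contains line needle))
```

Both functions return the same results; only the dispatch differs. In a
hot loop such as a mapper that runs once per input line, the hinted
version is the one you want.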

Young generation allocations shouldn't cause a full GC; that's the point of
having a young generation.

Uncontended atom swaps are going to be a bit

On Sun, Apr 28, 2013 at 9:41 AM, Ji Zhang <zhangj...@gmail.com> wrote:

> Hi,
>
> I come with an update.
>
> Apart from the slow startup time, which I can resolve by reusing the JVM
> or enlarging the splits, there's another factor that influences
> performance a lot: GC. It turns out Clojure generates a lot of transient
> objects, which causes frequent young GC, over five times per second. By
> contrast, the pure Java counterpart map-red job GCs only once per second.
>
> Also, I found a very informative article on this particular issue; I
> believe many of you have read it:
>
> http://berlinbrowndev.blogspot.com/2009/07/jvm-notebook-basic-clojure-java-and-jvm.html
>
> So what I am asking is: does anyone use clojure-hadoop in production?
> How's the speed? The GC frequency?
>
> I ran a simple map-red job that looks for a substring in each input line
> (using .contains) and increments a counter (swap! cnt inc). The result:
> processing 4GB of data in one mapper, Clojure takes 5'37" and pure Java
> takes 4'42", about a one-minute difference. Although the Clojure code is
> shorter and cooler, the performance loss is not quite worthwhile, IMHO.
>
> So I hope there are some realistic cases showing that Clojure is suitable
> for writing Hadoop jobs.
>
> Thanks.
>
> On Friday, April 26, 2013 6:05:33 PM UTC+8, Ji Zhang wrote:
>
>> Hi,
>>
>> I'm writing a map-reduce job with Clojure, only to find that it seems to
>> be much slower than a Java job.
>>
>> So I wrote a simple test case and uploaded it to a gist:
>> https://gist.github.com/jizhang/5466149
>>
>> At the end of the code are the execution outputs; here are some
>> significant stats:
>>
>> Average time taken by Map tasks: Java 7sec, Clojure 19sec
>> CPU time spent (ms): Java 244,000, Clojure 1,145,440
>>
>> I'm wondering what slows down the map-reduce job written in Clojure. Am I
>> using it wrong, or is it just an inappropriate scenario?
>>
>> Any thoughts would be great. Thanks!
>>
>> Jerry
>>

-- 
-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.