SparkPi is just an example, so its performance doesn't really matter;
for example code, simpler is better.
The Kryo case could be a real issue, but fixing it would be a change in
Kryo, not in Spark.
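
For anyone who wants to reproduce the comparison outside Spark, here is a
minimal sketch, not the SparkPi code itself. It assumes the slowdown comes
from all threads drawing from one shared java.util.Random; the class name
PiSketch, the method estimate, and the thread and sample counts are made up
for illustration.

```java
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

public class PiSketch {
    // One generator shared by all threads: each call updates the same
    // internal seed, so threads contend with each other.
    private static final Random SHARED = new Random();

    /** Monte Carlo estimate of pi using `threads` workers (hypothetical helper). */
    static double estimate(int threads, int samplesPerThread, boolean threadLocal) {
        AtomicLong hits = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                long local = 0;
                for (int i = 0; i < samplesPerThread; i++) {
                    double x, y;
                    if (threadLocal) {
                        // Per-thread generator: no shared state to fight over.
                        ThreadLocalRandom r = ThreadLocalRandom.current();
                        x = r.nextDouble() * 2 - 1;
                        y = r.nextDouble() * 2 - 1;
                    } else {
                        x = SHARED.nextDouble() * 2 - 1;
                        y = SHARED.nextDouble() * 2 - 1;
                    }
                    if (x * x + y * y <= 1) {
                        local++;
                    }
                }
                // Publish once per thread so the counter itself is not a hotspot.
                hits.addAndGet(local);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return 4.0 * hits.get() / ((double) threads * samplesPerThread);
    }

    public static void main(String[] args) {
        for (boolean threadLocal : new boolean[] {false, true}) {
            long start = System.nanoTime();
            double pi = estimate(4, 2_000_000, threadLocal);
            long ms = (System.nanoTime() - start) / 1_000_000;
            String label = threadLocal ? "ThreadLocalRandom" : "shared Random";
            System.out.println(label + ": pi ~ " + pi + " in " + ms + " ms");
        }
    }
}
```

Timings will vary by machine and JVM, so treat the printed numbers as a
rough indication only; both variants should produce a comparable estimate
of pi.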

On Fri, Nov 25, 2016 at 7:30 AM Prasun Ratn <prasun.r...@gmail.com> wrote:

> Hi,
>
> I am seeing performance degradation in the SparkPi example on a
> single-node setup (using local[K]).
>
> Using 1, 2, 4, and 8 cores, this is the execution time in seconds for
> the same number of iterations:
> Random: 4.0, 7.0, 12.96, 17.96
>
> If I change the code to use ThreadLocalRandom
> (
> https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/SparkPi.scala#L35
> )
> it scales properly:
> ThreadLocalRandom: 2.2, 1.4, 1.07, 1.00
>
> I see a similar issue with the Kryo serializer in another app: the push
> function shows up at the top of the profile data, but goes away
> completely if I use ThreadLocalRandom.
>
>
> https://github.com/EsotericSoftware/kryo/blob/master/src/com/esotericsoftware/kryo/util/ObjectMap.java#L259
>
> The JDK documentation
> (
> https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadLocalRandom.html
> )
> says:
>
> > When applicable, use of ThreadLocalRandom rather than shared Random
> > objects in concurrent programs will typically encounter much less
> > overhead and contention. Use of ThreadLocalRandom is particularly
> > appropriate when multiple tasks (for example, each a ForkJoinTask) use
> > random numbers in parallel in thread pools.
>
> I am using Spark 1.5 and Java 1.8.0_91.
>
> Is there any reason to prefer Random over ThreadLocalRandom?
>
> Thanks
> Prasun