Dear Apostolos,

Thanks for the response!

Our version is built on Spark 2.1; the problem is that the state-of-the-art
system I'm trying to compare against is built on version 1.2, so I have to
deal with it.

If I understand the level of parallelism correctly, --total-executor-cores is
set to the number of workers multiplied by the executor cores of each worker,
in this case 32 as well. I use a similar script in both cases, so it
shouldn't change.

Thanks and regards,
Jeevan K. Srivatsa

On Wed, 29 Aug 2018 at 16:07, Apostolos N. Papadopoulos <papad...@csd.auth.gr> wrote:

> Dear Jeevan,
>
> Spark 1.2 is quite old, and if I were you I would go for a newer version.
>
> However, is there a parallelism level (e.g., 20, 30) that works for both
> installations?
>
> regards,
>
> Apostolos
>
> On 29/08/2018 04:55 PM, jeevan.ks wrote:
> > Hi,
> >
> > I have two systems. One is built on Spark 1.2 and the other on 2.1. I am
> > benchmarking both with the same benchmarks (wordcount, grep, sort, etc.)
> > and the same data set from an S3 bucket (size ranges from 50 MB to 10 GB).
> > The Spark cluster I used is r3.xlarge, 8 instances, 4 cores each, and
> > 28GB RAM. I observed some strange behaviour while running the benchmarks:
> >
> > - When I ran the Spark 1.2 version with the default partition number
> > (sc.defaultParallelism), the jobs would take forever to complete. So I
> > changed it to three times the number of cores, i.e., 32 times 3 = 96.
> > This worked like magic and the jobs completed quickly.
> >
> > - However, when I tried the same magic number on version 2.1, the jobs
> > took forever. Default parallelism works better, but is still not that
> > efficient.
> >
> > I'm having trouble rationalising this and comparing the two systems. My
> > question is: what changes were made from 1.2 to 2.1 with respect to
> > default parallelism for this behaviour to occur? How can I have both
> > versions behave similarly on the same software/hardware configuration so
> > that I can compare?
> >
> > I'd really appreciate your help on this!
> >
> > Cheers,
> > Jeevan
> >
> > --
> > Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
> >
> > ---------------------------------------------------------------------
> > To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
> --
> Apostolos N. Papadopoulos, Associate Professor
> Department of Informatics
> Aristotle University of Thessaloniki
> Thessaloniki, GREECE
> tel: ++0030312310991918
> email: papad...@csd.auth.gr
> twitter: @papadopoulos_ap
> web: http://delab.csd.auth.gr/~apostol
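
For reference, the core and partition arithmetic discussed in this thread can be sketched as below. This is plain Python, not Spark code; the helper names total_executor_cores and partitions_for are hypothetical, and the multiplier of 3 is just the value quoted in the original message, not a recommendation.

```python
def total_executor_cores(workers: int, cores_per_worker: int) -> int:
    """Cores across the cluster: workers multiplied by cores per worker.

    In the thread this is the value given to --total-executor-cores.
    """
    return workers * cores_per_worker


def partitions_for(total_cores: int, multiplier: int = 3) -> int:
    """Partition count as a multiple of the total cores.

    The default multiplier of 3 is an assumption taken from the thread.
    """
    return total_cores * multiplier


# Cluster from the thread: 8 r3.xlarge instances with 4 cores each.
cores = total_executor_cores(workers=8, cores_per_worker=4)
partitions = partitions_for(cores)
print(cores, partitions)
```

To keep the two versions comparable, one option might be to pass the same explicit partition count in both jobs (for example through spark.default.parallelism or the minPartitions argument of sc.textFile) rather than relying on each version's default.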