Hi xiefeng,

SparkContext initialization takes some time, and Spark does not really shine for small data computations: http://aadrake.com/command-line-tools-can-be-235x-faster-than-your-hadoop-cluster.html
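To make the overhead concrete, here is a minimal, self-contained sketch (the class name, the toy data, the local[*] master and the loop count are illustrative assumptions, not code from this thread). It separates the one-time SparkContext start-up cost from the per-action scheduling cost that every request in your benchmark pays:

    import java.util.Arrays;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    // Illustrative sketch: time the context start-up, then the average
    // cost of a trivial action on an already cached RDD.
    public class TinyJobOverhead {
        public static void main(String[] args) {
            long t0 = System.currentTimeMillis();
            SparkConf conf = new SparkConf()
                    .setAppName("tiny-job-overhead")
                    .setMaster("local[*]");   // assumption: local mode, just for the sketch
            JavaSparkContext sc = new JavaSparkContext(conf);
            long t1 = System.currentTimeMillis();
            System.out.println("SparkContext start-up: " + (t1 - t0) + " ms");

            JavaRDD<String> rdd = sc.parallelize(Arrays.asList("a", "b", "c")).cache();
            rdd.count();                      // materialize the cache

            // Every action, even first() on cached data, is scheduled as a full
            // Spark job: DAG scheduling, task serialization, result fetch.
            long t2 = System.currentTimeMillis();
            for (int i = 0; i < 100; i++) {
                rdd.first();
            }
            long t3 = System.currentTimeMillis();
            System.out.println("Average per-action overhead: " + (t3 - t2) / 100.0 + " ms");

            sc.stop();
        }
    }

Even with the RDD cached, each call to first() goes through the driver's job scheduler, so a request-per-second benchmark on a trivial RDD mostly measures that fixed scheduling latency rather than any computation.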
But when working with terabytes (or petabytes) of data, those 35 seconds of initialization don't really matter.

Regards,
--
Bedrytski Aliaksandr
sp...@bedryt.ski

On Wed, Aug 31, 2016, at 11:45, xiefeng wrote:
> I installed a standalone Spark cluster (one master and one worker) on a
> Windows 2008 server with 16 cores and 24 GB of memory.
>
> I have done a simple test: just create a string RDD and simply return it.
> I use JMeter to measure throughput, but the highest I get is around 35/sec.
> I think Spark is powerful at distributed calculation, so why is the
> throughput so limited in such a simple test scenario, which involves only
> task dispatch and no calculation?
>
> 1. In JMeter I tested with both 10 threads and 100 threads; the difference
>    is small, around 2-3/sec.
> 2. I tested both caching and not caching the RDD; there is little difference.
> 3. During the test, CPU and memory usage stay low.
>
> Below is my test code:
>
> @RestController
> public class SimpleTest {
>     @RequestMapping(value = "/SimpleTest", method = RequestMethod.GET)
>     @ResponseBody
>     public String testProcessTransaction() {
>         return SparkShardTest.simpleRDDTest();
>     }
> }
>
> final static Map<String, JavaRDD<String>> simpleRDDs = initSimpleRDDs();
>
> public static Map<String, JavaRDD<String>> initSimpleRDDs()
> {
>     Map<String, JavaRDD<String>> result = new ConcurrentHashMap<String, JavaRDD<String>>();
>     JavaRDD<String> rddData = JavaSC.parallelize(data);
>     rddData.cache().count(); // this cache improves throughput by 1-2/sec
>     result.put("MyRDD", rddData);
>     return result;
> }
>
> public static String simpleRDDTest()
> {
>     JavaRDD<String> rddData = simpleRDDs.get("MyRDD");
>     return rddData.first();
> }
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Why-does-spark-take-so-much-time-for-simple-task-without-calculation-tp27628.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org