Change Hadoop version when importing Spark

2016-02-24 Thread YouPeng Yang
Hi I am developing an application based on spark-1.6. My library dependencies are just libraryDependencies ++= Seq( "org.apache.spark" %% "spark-core" % "1.6.0" ). It uses Hadoop 2.2.0 as the default Hadoop version, which is not my preference. I want to change the Hadoop version when importing Spark. How
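
A minimal build.sbt sketch of one way to do this (assuming sbt; the Hadoop 2.6.0 below is a placeholder, since the desired version is not stated above): exclude the hadoop-client that spark-core pulls in transitively and pin an explicit version instead.

    // build.sbt sketch: replace the transitive Hadoop dependency with an explicit one
    libraryDependencies ++= Seq(
      ("org.apache.spark" %% "spark-core" % "1.6.0")
        .exclude("org.apache.hadoop", "hadoop-client"),
      // hypothetical target version; substitute the Hadoop release you actually need
      "org.apache.hadoop" % "hadoop-client" % "2.6.0"
    )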

Re: question about the license of akka and Spark

2014-05-20 Thread YouPeng Yang
Akka is under the Apache 2 license too. > http://doc.akka.io/docs/akka/snapshot/project/licenses.html > > > On Tue, May 20, 2014 at 2:16 AM, YouPeng Yang > wrote: > >> Hi >> I just learned that Akka is under a commercial license; however, Spark is under the >> apach

question about the license of akka and Spark

2014-05-20 Thread YouPeng Yang
Hi I just learned that Akka is under a commercial license; however, Spark is under the Apache license. Is there any problem? Regards

Re: how to set spark.executor.memory and heap size

2014-04-24 Thread YouPeng Yang
Hi I am also curious about this question. The textFile function was supposed to read an HDFS file? In this case, the file is read from the local filesystem. Is there any way for the textFile function to tell the local filesystem apart from HDFS? Besides, the OOM exe
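
For reference, a small sketch (Spark 1.x-era Scala API, hypothetical paths and hostnames) of how the URI scheme on the path decides which filesystem textFile reads from:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("fs-demo"))
    val localRdd = sc.textFile("file:///tmp/data.txt")                  // explicit local filesystem
    val hdfsRdd  = sc.textFile("hdfs://namenode:8020/user/me/data.txt") // explicit HDFS
    // a bare path such as "/data.txt" resolves against fs.defaultFS in the Hadoop configuration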

question about the SocketReceiver

2014-04-20 Thread YouPeng Yang
Hi I am studying the structure of Spark Streaming (my Spark version is 0.9.0). I have a question about the SocketReceiver. In the onStart function: --- protected def onStart() { logInfo("Connecting to " + host + ":" + port) val sock
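
The quoted source is cut off above; as a rough, simplified sketch of the general pattern such an onStart follows (illustrative only, not the verbatim Spark 0.9.0 code): connect the socket, then hand its input stream to whatever deserializes and stores the records.

    import java.net.Socket
    import java.io.InputStream

    // Illustrative socket-receiver start logic; names and structure are simplified.
    def onStartSketch(host: String, port: Int)(consume: InputStream => Unit): Unit = {
      println("Connecting to " + host + ":" + port)
      val socket = new Socket(host, port)   // open the TCP connection to the source
      consume(socket.getInputStream)        // read and store records from the stream
    }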

Re: partitioning of small data sets

2014-04-15 Thread YouPeng Yang
Hi Actually, you can set the partition number yourself by changing the 'spark.default.parallelism' property. Otherwise, Spark will use the default partition count, defaultParallelism. For local mode, defaultParallelism = totalcores. For local-cluster mode, defaultParallelism = math.max(totalcores
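
As a concrete illustration, a minimal sketch (Spark 1.x-era API, placeholder values) of the two places the partition count can come from:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setMaster("local[4]")
      .setAppName("parallelism-demo")
      .set("spark.default.parallelism", "8")          // overrides the computed default
    val sc = new SparkContext(conf)

    val byDefault  = sc.parallelize(1 to 100)                 // uses defaultParallelism
    val byExplicit = sc.parallelize(1 to 100, numSlices = 16) // per-RDD override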

Re: Master registers itself at startup?

2014-04-13 Thread YouPeng Yang
Hi The 512MB is the default memory size that each executor needs, and actually your job does not need as much as the default memory size. You can create a SparkContext with sc = new SparkContext("local-cluster[2,1,512]", "test") // suppose you use the local-cluster mode. Here the 512 is the m
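
A small sketch of the same idea with the SparkConf API of that era (the numbers are the placeholders from above, not recommendations): in the local-cluster master string the three values are the number of workers, cores per worker, and memory per worker in MB.

    import org.apache.spark.{SparkConf, SparkContext}

    // local-cluster[2,1,512]: 2 workers, 1 core each, 512 MB of memory each
    val conf = new SparkConf()
      .setMaster("local-cluster[2,1,512]")
      .setAppName("test")
      .set("spark.executor.memory", "512m")  // per-executor heap; adjust to the job
    val sc = new SparkContext(conf)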