How is Spark built on top of the Akka framework?
Hi all,

I'd like to know: how is Spark built on top of the Akka framework? Is there any information about that?

Thanks in advance.
Re: How is Spark built on top of the Akka framework?
As of Spark 2.0 (not yet released), Spark does not use Akka any more. See https://issues.apache.org/jira/browse/SPARK-5293

On Fri, Jan 29, 2016 at 1:14 AM, Lorena Reis wrote:
> Hi all,
>
> I'd like to know: how is Spark built on top of the Akka framework? Is
> there any information about that?
>
> Thanks in advance.
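For context on what the dependency looked like before its removal: in the 1.x line Spark used Akka internally for RPC between the driver and executors, and to users that mostly surfaced as spark.akka.* configuration keys. A minimal sketch of setting one of those pre-2.0 knobs; the application name and the value shown are purely illustrative, not taken from this thread:

    import org.apache.spark.SparkConf

    // Applies to Spark 1.x only: Akka backed Spark's internal RPC layer, and its
    // tuning knobs were exposed as spark.akka.* configuration keys. As of 2.0
    // these keys disappear along with the Akka dependency (SPARK-5293).
    val conf = new SparkConf()
      .setAppName("akka-config-sketch")
      .set("spark.akka.frameSize", "128") // maximum RPC message size, in MB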
Re: Adding Naive Bayes sample code in Documentation
JIRA created! https://issues.apache.org/jira/browse/SPARK-13089
Feel free to pick it up if you're interested. :)

Joseph

On Wed, Jan 27, 2016 at 8:43 AM, Vinayak Agrawal wrote:
> Hi,
> I was reading through the Spark ML package and I couldn't find Naive Bayes
> examples documented on the Spark documentation page:
> http://spark.apache.org/docs/latest/ml-classification-regression.html
>
> However, the API exists and can be used:
> https://spark.apache.org/docs/1.5.2/api/python/pyspark.ml.html#module-pyspark.ml.classification
>
> Could the examples be added to the latest documentation?
>
> --
> Vinayak Agrawal
>
> "To Strive, To Seek, To Find and Not to Yield!"
> ~Lord Alfred Tennyson
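For reference, a rough sketch of the kind of example the JIRA asks for, written against the 1.6 spark.ml Scala API; the data path, split ratio, seed, and smoothing value are placeholders rather than anything agreed in this thread:

    import org.apache.spark.ml.classification.NaiveBayes
    import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator

    // Load a LIBSVM-format dataset as a DataFrame (path is a placeholder).
    val data = sqlContext.read.format("libsvm")
      .load("data/mllib/sample_libsvm_data.txt")

    // Hold out 30% of the data for evaluation.
    val Array(training, test) = data.randomSplit(Array(0.7, 0.3), seed = 1234L)

    // Train a multinomial Naive Bayes model with Laplace smoothing.
    val model = new NaiveBayes()
      .setSmoothing(1.0)
      .setModelType("multinomial")
      .fit(training)

    // Score the held-out data and report precision.
    val predictions = model.transform(test)
    val evaluator = new MulticlassClassificationEvaluator()
      .setMetricName("precision")
    println("Precision: " + evaluator.evaluate(predictions))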
Re: Spark 2.0.0 release plan
I'm not an authoritative source, but I think it is indeed the plan to move the default build to 2.11. See this discussion for more detail:

http://apache-spark-developers-list.1001551.n3.nabble.com/A-proposal-for-Spark-2-0-td15122.html

On Fri, Jan 29, 2016 at 11:43 AM, Deenar Toraskar wrote:
> A related question: are there plans to move the default Spark builds to
> Scala 2.11 with Spark 2.0?
>
> Regards
> Deenar
>
> On 27 January 2016 at 19:55, Michael Armbrust wrote:
>> We do maintenance releases on demand when there is enough to justify
>> doing one. I'm hoping to cut 1.6.1 soon, but have not had time yet.
>>
>> On Wed, Jan 27, 2016 at 8:12 AM, Daniel Siegmann wrote:
>>> Will there continue to be monthly releases on the 1.6.x branch during
>>> the additional time for bug fixes and such?
>>>
>>> On Tue, Jan 26, 2016 at 11:28 PM, Koert Kuipers wrote:
>>>> Thanks, that's all I needed.
>>>>
>>>> On Tue, Jan 26, 2016 at 6:19 PM, Sean Owen wrote:
>>>>> I think it will come significantly later -- or else we'd be at code
>>>>> freeze for 2.x in a few days. I haven't heard anyone discuss this
>>>>> officially, but May or so had been batted around informally in
>>>>> conversation. Does anyone have a particularly strong opinion on that?
>>>>> That's basically an extra 3-month period.
>>>>>
>>>>> https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage
>>>>>
>>>>> On Tue, Jan 26, 2016 at 10:00 PM, Koert Kuipers wrote:
>>>>>> Is the idea that Spark 2.0 comes out roughly 3 months after 1.6,
>>>>>> i.e. a quarterly release as usual?
>>>>>> Thanks
Re: Spark 2.0.0 release plan
It's already underway: https://github.com/apache/spark/pull/10608

On Fri, Jan 29, 2016 at 11:50 AM, Jakob Odersky wrote:
> I'm not an authoritative source, but I think it is indeed the plan to
> move the default build to 2.11. See this discussion for more detail:
>
> http://apache-spark-developers-list.1001551.n3.nabble.com/A-proposal-for-Spark-2-0-td15122.html
Re: Spark 1.6.1
Hi Michael

The Dataset aggregators do not appear to support complex Spark SQL types. I wasn't sure whether I was doing something wrong, or whether this is a bug or a feature that just isn't implemented yet. Having it in would be great. See below (reposting this from the Spark user list):

https://docs.cloud.databricks.com/docs/spark/1.6/index.html#examples/Dataset%20Aggregator.html

I have been converting my UDAFs to Dataset Aggregators (Datasets are cool, BTW). I have an ArraySum aggregator that does an element-wise sum of arrays. I have the simple version working, but the generic version fails with the following error, and I am not sure what I am doing wrong.

scala> import sqlContext.implicits._

scala> def arraySum[I, N : Numeric : Encoder](f: I => N): TypedColumn[I, N] = new GenericArraySumAggregator(f).toColumn
<console>:34: error: Unable to find encoder for type stored in a Dataset.
Primitive types (Int, String, etc) and Product types (case classes) are
supported by importing sqlContext.implicits._  Support for serializing other
types will be added in future releases.
       def arraySum[I, N : Numeric : Encoder](f: I => N): TypedColumn[I, N] = new GenericArraySumAggregator(f).toColumn
                                                                               ^

object ArraySumAggregator extends Aggregator[Seq[Float], Seq[Float], Seq[Float]] with Serializable {
  def zero: Seq[Float] = Nil    // The initial value.
  def reduce(currentSum: Seq[Float], currentRow: Seq[Float]) = sumArray(currentSum, currentRow)
  def merge(sum: Seq[Float], row: Seq[Float]) = sumArray(sum, row)
  def finish(b: Seq[Float]) = b // Return the final result.
  def sumArray(a: Seq[Float], b: Seq[Float]): Seq[Float] = {
    (a, b) match {
      case (Nil, Nil) => Nil
      case (Nil, row) => row
      case (sum, Nil) => sum
      case (sum, row) => (a, b).zipped.map { case (x, y) => x + y }
    }
  }
}

class GenericArraySumAggregator[I, N : Numeric](f: I => N) extends Aggregator[Seq[I], Seq[N], Seq[N]] with Serializable {
  val numeric = implicitly[Numeric[N]]
  override def zero: Seq[N] = Nil
  override def reduce(b: Seq[N], a: Seq[I]): Seq[N] = sumArray(b, a.map(x => f(x)))
  override def merge(b1: Seq[N], b2: Seq[N]): Seq[N] = sumArray(b1, b2)
  override def finish(reduction: Seq[N]): Seq[N] = reduction
  def sumArray(a: Seq[N], b: Seq[N]): Seq[N] = {
    (a, b) match {
      case (Nil, Nil) => Nil
      case (Nil, row) => row
      case (sum, Nil) => sum
      case (sum, row) => (a, b).zipped.map { case (x, y) => numeric.plus(x, y) }
    }
  }
}

Regards
Deenar
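A note on the error above: GenericArraySumAggregator's buffer and output types are Seq[N], so toColumn has to find an Encoder[Seq[N]], while the method only demands an Encoder[N]; the declared return type also doesn't match the aggregator's input type Seq[I]. A minimal sketch, not a confirmed fix, that makes the real requirement explicit (it assumes an Encoder[Seq[N]] can actually be materialized for the element types in use):

    import org.apache.spark.sql.{Encoder, TypedColumn}

    // The aggregator is Aggregator[Seq[I], Seq[N], Seq[N]], so both encoders that
    // toColumn looks up implicitly are Encoder[Seq[N]]. Ask the caller for that
    // encoder directly instead of an Encoder[N], and line the TypedColumn's input
    // type up with Seq[I].
    def arraySum[I, N : Numeric](f: I => N)(
        implicit seqEncoder: Encoder[Seq[N]]): TypedColumn[Seq[I], Seq[N]] =
      new GenericArraySumAggregator(f).toColumn

If no such implicit encoder is available in the Spark version at hand, that missing encoder, rather than the aggregator logic itself, is likely what the error message is pointing at.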
Re: Spark 1.6.1
I think this is fixed in branch-1.6 already. If you can reproduce it there, can you please open a JIRA and ping me?

On Fri, Jan 29, 2016 at 12:16 PM, deenar <deenar.toras...@thinkreactive.co.uk> wrote:
> Hi Michael
>
> The Dataset aggregators do not appear to support complex Spark SQL types. I
> wasn't sure whether I was doing something wrong, or whether this is a bug or
> a feature that just isn't implemented yet. Having it in would be great. See
> below (reposting this from the Spark user list):
>
> https://docs.cloud.databricks.com/docs/spark/1.6/index.html#examples/Dataset%20Aggregator.html
>
> Regards
> Deenar