How is Spark built on top of the Akka framework?

2016-01-29 Thread Lorena Reis
Hi all,

I'd like to know how Spark is built on top of the Akka framework. Is there
any information about that?

Thanks in advance.


Re: How is Spark built on top of the Akka framework?

2016-01-29 Thread Reynold Xin
As of Spark 2.0 (not yet released), Spark does not use Akka any more.

See https://issues.apache.org/jira/browse/SPARK-5293
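
In practice, the main consequence for applications is that anything which had
Akka on its classpath only because Spark 1.x pulled it in transitively will
need to declare the dependency itself once it moves to 2.0. A rough,
illustrative build.sbt sketch (the version numbers here are assumptions, not
recommendations):

// Hypothetical build.sbt for an application that uses Akka actors directly.
// Spark 2.x no longer brings Akka in transitively, so it must be declared here.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.0.0" % "provided",
  "com.typesafe.akka" %% "akka-actor" % "2.3.11"  // illustrative version
)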

On Fri, Jan 29, 2016 at 1:14 AM, Lorena Reis  wrote:

> Hi all,
>
> I'd like to know how Spark is built on top of the Akka framework. Is there
> any information about that?
>
> Thanks in advance.
>


Re: Adding Naive Bayes sample code in Documentation

2016-01-29 Thread Joseph Bradley
JIRA created!  https://issues.apache.org/jira/browse/SPARK-13089
Feel free to pick it up if you're interested.  : )
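
For whoever picks it up, the documented example would presumably be along the
lines of this minimal sketch (assuming a Spark 1.6 spark-shell where sqlContext
is already in scope; the toy data below is made up purely for illustration):

import org.apache.spark.ml.classification.NaiveBayes
import org.apache.spark.mllib.linalg.Vectors

// Toy training set with the default "label" and "features" column names.
val training = sqlContext.createDataFrame(Seq(
  (0.0, Vectors.dense(1.0, 0.0, 0.0)),
  (1.0, Vectors.dense(0.0, 1.0, 0.1)),
  (1.0, Vectors.dense(0.0, 0.2, 1.0))
)).toDF("label", "features")

// Fit a multinomial Naive Bayes model and show its predictions.
val model = new NaiveBayes()
  .setSmoothing(1.0)
  .setModelType("multinomial")
  .fit(training)

model.transform(training).show()
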
Joseph

On Wed, Jan 27, 2016 at 8:43 AM, Vinayak Agrawal  wrote:

> Hi,
> I was reading through the Spark ML package and couldn't find Naive Bayes
> examples documented on the Spark documentation page.
> http://spark.apache.org/docs/latest/ml-classification-regression.html
>
> However, the API exists and can be used.
>
> https://spark.apache.org/docs/1.5.2/api/python/pyspark.ml.html#module-pyspark.ml.classification
>
> Can the examples be added in the latest documentation?
>
> --
> Vinayak Agrawal
>
>
> "To Strive, To Seek, To Find and Not to Yield!"
> ~Lord Alfred Tennyson
>


Re: Spark 2.0.0 release plan

2016-01-29 Thread Jakob Odersky
I'm not an authoritative source but I think it is indeed the plan to
move the default build to 2.11.

See this discussion for more detail
http://apache-spark-developers-list.1001551.n3.nabble.com/A-proposal-for-Spark-2-0-td15122.html
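
In the meantime, nothing stops an application from compiling against the Scala
2.11 artifacts that are already published for the 1.x line; a minimal sbt
sketch (the versions shown are just examples):

// Hypothetical application build.sbt opting into the Scala 2.11 artifacts today.
scalaVersion := "2.11.7"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0" % "provided"

The open question in this thread is only which Scala version the default
convenience builds are compiled with.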

On Fri, Jan 29, 2016 at 11:43 AM, Deenar Toraskar
 wrote:
> A related question. Are there plans to move the default Spark builds to Scala
> 2.11 with Spark 2.0?
>
> Regards
> Deenar
>
> On 27 January 2016 at 19:55, Michael Armbrust 
> wrote:
>>
>> We do maintenance releases on demand when there is enough to justify doing
>> one.  I'm hoping to cut 1.6.1 soon, but have not had time yet.
>>
>> On Wed, Jan 27, 2016 at 8:12 AM, Daniel Siegmann
>>  wrote:
>>>
>>> Will there continue to be monthly releases on the 1.6.x branch during the
>>> additional time for bug fixes and such?
>>>
>>> On Tue, Jan 26, 2016 at 11:28 PM, Koert Kuipers 
>>> wrote:

 thanks, that's all I needed

 On Tue, Jan 26, 2016 at 6:19 PM, Sean Owen  wrote:
>
> I think it will come significantly later -- or else we'd be at code
> freeze for 2.x in a few days. I haven't heard anyone discuss this
> officially but had batted around May or so instead informally in
> conversation. Does anyone have a particularly strong opinion on that?
> That's basically an extra 3 month period.
>
> https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage
>
> On Tue, Jan 26, 2016 at 10:00 PM, Koert Kuipers 
> wrote:
> > Is the idea that Spark 2.0 comes out roughly 3 months after 1.6? So
> > quarterly releases as usual?
> > Thanks


>>>
>>
>




Re: Spark 2.0.0 release plan

2016-01-29 Thread Michael Armbrust
It's already underway: https://github.com/apache/spark/pull/10608

On Fri, Jan 29, 2016 at 11:50 AM, Jakob Odersky  wrote:

> I'm not an authoritative source but I think it is indeed the plan to
> move the default build to 2.11.
>
> See this discussion for more detail
>
> http://apache-spark-developers-list.1001551.n3.nabble.com/A-proposal-for-Spark-2-0-td15122.html
>
> On Fri, Jan 29, 2016 at 11:43 AM, Deenar Toraskar
>  wrote:
> > A related question. Are there plans to move the default Spark builds to
> > Scala 2.11 with Spark 2.0?
> >
> > Regards
> > Deenar
> >
> > On 27 January 2016 at 19:55, Michael Armbrust 
> > wrote:
> >>
> >> We do maintenance releases on demand when there is enough to justify
> >> doing one.  I'm hoping to cut 1.6.1 soon, but have not had time yet.
> >>
> >> On Wed, Jan 27, 2016 at 8:12 AM, Daniel Siegmann
> >>  wrote:
> >>>
> >>> Will there continue to be monthly releases on the 1.6.x branch during
> >>> the additional time for bug fixes and such?
> >>>
> >>> On Tue, Jan 26, 2016 at 11:28 PM, Koert Kuipers 
> >>> wrote:
> 
 thanks, that's all I needed
> 
>  On Tue, Jan 26, 2016 at 6:19 PM, Sean Owen 
> wrote:
> >
> > I think it will come significantly later -- or else we'd be at code
> > freeze for 2.x in a few days. I haven't heard anyone discuss this
> > officially but had batted around May or so instead informally in
> > conversation. Does anyone have a particularly strong opinion on that?
> > That's basically an extra 3 month period.
> >
> > https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage
> >
> > On Tue, Jan 26, 2016 at 10:00 PM, Koert Kuipers 
> > wrote:
> > > Is the idea that Spark 2.0 comes out roughly 3 months after 1.6? So
> > > quarterly releases as usual?
> > > Thanks
> 
> 
> >>>
> >>
> >
>


Re: Spark 2.0.0 release plan

2016-01-29 Thread Mark Hamstra
https://github.com/apache/spark/pull/10608

On Fri, Jan 29, 2016 at 11:50 AM, Jakob Odersky  wrote:

> I'm not an authoritative source but I think it is indeed the plan to
> move the default build to 2.11.
>
> See this discussion for more detail
>
> http://apache-spark-developers-list.1001551.n3.nabble.com/A-proposal-for-Spark-2-0-td15122.html
>
> On Fri, Jan 29, 2016 at 11:43 AM, Deenar Toraskar
>  wrote:
> > A related question. Are there plans to move the default Spark builds to
> > Scala 2.11 with Spark 2.0?
> >
> > Regards
> > Deenar
> >
> > On 27 January 2016 at 19:55, Michael Armbrust 
> > wrote:
> >>
> >> We do maintenance releases on demand when there is enough to justify
> >> doing one.  I'm hoping to cut 1.6.1 soon, but have not had time yet.
> >>
> >> On Wed, Jan 27, 2016 at 8:12 AM, Daniel Siegmann
> >>  wrote:
> >>>
> >>> Will there continue to be monthly releases on the 1.6.x branch during
> >>> the additional time for bug fixes and such?
> >>>
> >>> On Tue, Jan 26, 2016 at 11:28 PM, Koert Kuipers 
> >>> wrote:
> 
 thanks, that's all I needed
> 
>  On Tue, Jan 26, 2016 at 6:19 PM, Sean Owen 
> wrote:
> >
> > I think it will come significantly later -- or else we'd be at code
> > freeze for 2.x in a few days. I haven't heard anyone discuss this
> > officially but had batted around May or so instead informally in
> > conversation. Does anyone have a particularly strong opinion on that?
> > That's basically an extra 3 month period.
> >
> > https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage
> >
> > On Tue, Jan 26, 2016 at 10:00 PM, Koert Kuipers 
> > wrote:
> > > Is the idea that Spark 2.0 comes out roughly 3 months after 1.6? So
> > > quarterly releases as usual?
> > > Thanks
> 
> 
> >>>
> >>
> >
>
>
>


Re: Spark 1.6.1

2016-01-29 Thread deenar
Hi Michael

The Dataset aggregators do not appear to support complex Spark SQL types. I
wasn't sure whether I was doing something wrong or whether this was a bug or a
feature not implemented yet. Having support for this would be great. See below
(reposting this from the Spark user list)

https://docs.cloud.databricks.com/docs/spark/1.6/index.html#examples/Dataset%20Aggregator.html

I have been converting my UDAFs to Dataset Aggregators (Datasets are cool,
BTW). I have an ArraySum aggregator that does an element-wise sum of arrays.
I have got the simple version working, but the generic version fails with the
following error; I am not sure what I am doing wrong.

scala> import sqlContext.implicits._

scala> def arraySum[I, N : Numeric : Encoder](f: I => N): TypedColumn[I, N] =
         new GenericArraySumAggregator(f).toColumn

<console>:34: error: Unable to find encoder for type stored in a Dataset.
Primitive types (Int, String, etc) and Product types (case classes) are
supported by importing sqlContext.implicits._  Support for serializing other
types will be added in future releases.
       def arraySum[I, N : Numeric : Encoder](f: I => N): TypedColumn[I, N] =
         new GenericArraySumAggregator(f).toColumn
                                             ^

object ArraySumAggregator extends Aggregator[Seq[Float], Seq[Float], Seq[Float]]
    with Serializable {
  def zero: Seq[Float] = Nil                     // The initial value.
  def reduce(currentSum: Seq[Float], currentRow: Seq[Float]) =
    sumArray(currentSum, currentRow)
  def merge(sum: Seq[Float], row: Seq[Float]) = sumArray(sum, row)
  def finish(b: Seq[Float]) = b                  // Return the final result.
  def sumArray(a: Seq[Float], b: Seq[Float]): Seq[Float] = (a, b) match {
    case (Nil, Nil) => Nil
    case (Nil, row) => row
    case (sum, Nil) => sum
    case (sum, row) => (sum, row).zipped.map(_ + _)
  }
}

class GenericArraySumAggregator[I, N : Numeric](f: I => N)
    extends Aggregator[Seq[I], Seq[N], Seq[N]] with Serializable {
  val numeric = implicitly[Numeric[N]]
  override def zero: Seq[N] = Nil
  override def reduce(b: Seq[N], a: Seq[I]): Seq[N] = sumArray(b, a.map(f))
  override def merge(b1: Seq[N], b2: Seq[N]): Seq[N] = sumArray(b1, b2)
  override def finish(reduction: Seq[N]): Seq[N] = reduction
  def sumArray(a: Seq[N], b: Seq[N]): Seq[N] = (a, b) match {
    case (Nil, Nil) => Nil
    case (Nil, row) => row
    case (sum, Nil) => sum
    case (sum, row) => (sum, row).zipped.map((x, y) => numeric.plus(x, y))
  }
}



Regards
Deenar







Re: Spark 1.6.1

2016-01-29 Thread Michael Armbrust
I think this is fixed in branch-1.6 already.  If you can reproduce it there,
can you please open a JIRA and ping me?

On Fri, Jan 29, 2016 at 12:16 PM, deenar <
deenar.toras...@thinkreactive.co.uk> wrote:

> Hi Michael
>
> The Dataset aggregators do not appear to support complex Spark SQL types. I
> wasn't sure whether I was doing something wrong or whether this was a bug or
> a feature not implemented yet. Having support for this would be great. See
> below (reposting this from the Spark user list)
>
> https://docs.cloud.databricks.com/docs/spark/1.6/index.html#examples/Dataset%20Aggregator.html
>
> I have been converting my UDAFs to Dataset Aggregators (Datasets are cool,
> BTW). I have an ArraySum aggregator that does an element-wise sum of arrays.
> I have got the simple version working, but the generic version fails with
> the following error; I am not sure what I am doing wrong.
>
> scala> import sqlContext.implicits._
>
> scala> def arraySum[I, N : Numeric : Encoder](f: I => N): TypedColumn[I, N] =
>          new GenericArraySumAggregator(f).toColumn
>
> <console>:34: error: Unable to find encoder for type stored in a Dataset.
> Primitive types (Int, String, etc) and Product types (case classes) are
> supported by importing sqlContext.implicits._  Support for serializing other
> types will be added in future releases.
>        def arraySum[I, N : Numeric : Encoder](f: I => N): TypedColumn[I, N] =
>          new GenericArraySumAggregator(f).toColumn
>                                              ^
>
> object ArraySumAggregator extends Aggregator[Seq[Float], Seq[Float], Seq[Float]]
>     with Serializable {
>   def zero: Seq[Float] = Nil                     // The initial value.
>   def reduce(currentSum: Seq[Float], currentRow: Seq[Float]) =
>     sumArray(currentSum, currentRow)
>   def merge(sum: Seq[Float], row: Seq[Float]) = sumArray(sum, row)
>   def finish(b: Seq[Float]) = b                  // Return the final result.
>   def sumArray(a: Seq[Float], b: Seq[Float]): Seq[Float] = (a, b) match {
>     case (Nil, Nil) => Nil
>     case (Nil, row) => row
>     case (sum, Nil) => sum
>     case (sum, row) => (sum, row).zipped.map(_ + _)
>   }
> }
>
> class GenericArraySumAggregator[I, N : Numeric](f: I => N)
>     extends Aggregator[Seq[I], Seq[N], Seq[N]] with Serializable {
>   val numeric = implicitly[Numeric[N]]
>   override def zero: Seq[N] = Nil
>   override def reduce(b: Seq[N], a: Seq[I]): Seq[N] = sumArray(b, a.map(f))
>   override def merge(b1: Seq[N], b2: Seq[N]): Seq[N] = sumArray(b1, b2)
>   override def finish(reduction: Seq[N]): Seq[N] = reduction
>   def sumArray(a: Seq[N], b: Seq[N]): Seq[N] = (a, b) match {
>     case (Nil, Nil) => Nil
>     case (Nil, row) => row
>     case (sum, Nil) => sum
>     case (sum, row) => (sum, row).zipped.map((x, y) => numeric.plus(x, y))
>   }
> }
>
> Regards
> Deenar
>
>
>