Calling .values.stats doesn't compile; I get:

could not find implicit value for parameter num: Numeric[Iterable[Double]]

View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Computing-mean-and-standard-deviation-by-key-tp11192p14065.html
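For context: stats() is only available on RDDs whose element type has a Numeric instance, and after a groupByKey the values are Iterable[Double], which has none. A minimal sketch of the two usual fixes, assuming a spark-shell SparkContext named sc (the data and names are illustrative, not from the thread):

import org.apache.spark.util.StatCounter

// Toy pairs, just for the sketch
val pairs = sc.parallelize(Seq(("a", 1.0), ("a", 2.0), ("b", 3.0)))

val grouped = pairs.groupByKey()   // RDD[(String, Iterable[Double])]
// grouped.values.stats()          // fails: no Numeric[Iterable[Double]]

// Global stats: flatten back to an RDD[Double] first
val globalStats = grouped.values.flatMap(identity).stats()

// Per-key stats: fold each group's values into a StatCounter
val statsByKey = grouped.mapValues(vs => StatCounter(vs))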
What is a good way of calculating mean and std dev for Paired RDDs (key, value)?

Now I'm using an approach with reduceByKey but want to make my code more
concise and readable.

View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Computing-mean-and-standard-deviation-by-key-tp11192p14062.html
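If conciseness is the goal, one option is to fold Spark's StatCounter with aggregateByKey: count, mean, and std dev per key in a single pass, with no groupByKey. A sketch, reusing the illustrative pairs RDD from above:

import org.apache.spark.util.StatCounter

val statsByKey = pairs.aggregateByKey(new StatCounter())(
  (acc, v) => acc.merge(v),   // fold one value into a per-partition counter
  (a, b) => a.merge(b)        // combine partial counters across partitions
)

statsByKey.mapValues(s => (s.mean, s.stdev)).collect()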
  val stddev = math.sqrt(n * sumOfSquares - sum * sum) / n
  print("stddev: " + stddev)
  stddev
}

I hope that helps

View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Computing-mean-and-standard-deviation-by-key-tp11192p11334.html
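Only the tail of that helper survives in the archive; the surviving line is the one-pass population formula sqrt(n * sumOfSquares - sum^2) / n. Applied per key with reduceByKey, a complete version of the same idea might look like this (names are illustrative, not from the thread):

// Accumulate (count, sum, sum of squares) per key in one reduceByKey pass,
// then derive mean and population std dev from the three moments.
val moments = pairs.mapValues(v => (1L, v, v * v))
  .reduceByKey((a, b) => (a._1 + b._1, a._2 + b._2, a._3 + b._3))

val meanAndStddev = moments.mapValues { case (n, sum, sumSq) =>
  val mean = sum / n
  (mean, math.sqrt(n * sumSq - sum * sum) / n)   // population std dev
}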
Thanks for the help everyone. I got the mapValues approach working. I will
experiment with the reduceByKey approach later.

<3
-Kris

View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Computing-mean-and-standard-deviation-by-key-tp11192p11214.html
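The finished code isn't in the archive; with the MyClass records from the quoted code below, the mapValues approach can be as small as this (a sketch, with rdd and foo taken from the quote, the rest assumed):

import org.apache.spark.util.StatCounter

// One StatCounter per key, built from each group's foo values
val fooStatsByKey = rdd.groupByKey().mapValues(vs => StatCounter(vs.map(_.foo)))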
> rdd.groupByKey().foreach { x =>
>   val iterable = x._2
>   var sum = 0.0
>   var count = 0
>   iterable.foreach { y =>
>     sum = sum + y.foo
>     count = count + 1
>   }
>   val mean = sum / count
>   // save mean to database...
> }

View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Computing-mean-and-standard-deviation-by-key-tp11192p11207.html
You're certainly not iterating on the driver. The Iterable you process
in your function is on the cluster and done in parallel.

On Fri, Aug 1, 2014 at 8:36 PM, Kristopher Kalish wrote:
> The reason I want an RDD is because I'm assuming that iterating the
> individual elements of an RDD on the driver of the cluster is much slower
> than coming up with the mean and standard deviation using a
> map-reduce-based algorithm.
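The distinction being drawn, as a two-line sketch (rdd here is any RDD):

rdd.foreach(println)            // the closure runs on the executors, in parallel
rdd.collect().foreach(println)  // collect() ships every element to the driver first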
The reason I want an RDD is because I'm assuming that iterating the
individual elements of an RDD on the driver of the cluster is much slower
than coming up with the mean and standard deviation using a
map-reduce-based algorithm. I don't know the intimate details of Spark's
implementation, but it looks like the iteration would happen on the driver.
  // do fancy things with the mean and deviation
}

However, there seems to be no way to convert the iterable into an RDD. Is
there some other technique for doing this? I'm to the point where I'm
considering copying and pasting the StatCollector class and changing the
type from Double to MyClass (or making it generic).

Am I going down the wrong path?

View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Computing-mean-and-standard-deviation-by-key-tp11192.html
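On the last question: rather than copying the stats class and changing its element type, the projection itself can be made generic, and Spark's built-in StatCounter reused per key. One sketch of such a helper (every name here is illustrative, not from the thread):

import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD
import org.apache.spark.util.StatCounter

// Project each value to a Double, then fold per-key StatCounters in one
// pass; the per-group work runs on the executors, never on the driver.
def statsByKey[K: ClassTag, V: ClassTag](rdd: RDD[(K, V)])(toDouble: V => Double): RDD[(K, StatCounter)] =
  rdd.mapValues(toDouble)
     .aggregateByKey(new StatCounter())((s, v) => s.merge(v), (a, b) => a.merge(b))

// e.g. statsByKey(myClassRdd)(_.foo)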