Re: Constraint Solver for Spark

2014-06-05 Thread Xiangrui Meng
I don't quite understand why adding linear constraints would promote orthogonality. For the interfaces, if the subproblem is determined by Y^T Y and Y^T b for each iteration, then the least squares solver, the non-negative least squares solver, or your convex solver is simply a function (A, b) -> x
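
The "solver as a function (A, b) -> x" idea can be sketched in a few lines. This is only an illustrative sketch in plain Python, not MLlib code: the names `Solver`, `solve_linear`, `normal_equations`, and `solve_subproblem` are hypothetical, the right-hand side is called `r` here to avoid clashing with the solver argument `b`, and an unconstrained Gaussian-elimination solve stands in for whichever least squares, non-negative, or convex solver is plugged in.

```python
from typing import Callable, List, Tuple

Matrix = List[List[float]]
Vector = List[float]
# Hypothetical pluggable interface: any solver is a function (A, b) -> x.
Solver = Callable[[Matrix, Vector], Vector]


def solve_linear(A: Matrix, b: Vector) -> Vector:
    """Unconstrained solve of A x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for c in range(col, n + 1):
                M[i][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def normal_equations(Y: Matrix, r: Vector) -> Tuple[Matrix, Vector]:
    """Form A = Y^T Y and b = Y^T r for one subproblem."""
    k = len(Y[0])
    A = [[sum(row[i] * row[j] for row in Y) for j in range(k)] for i in range(k)]
    b = [sum(row[i] * ri for row, ri in zip(Y, r)) for i in range(k)]
    return A, b


def solve_subproblem(Y: Matrix, r: Vector, solver: Solver = solve_linear) -> Vector:
    """The caller only builds (A, b); the solver itself is swappable."""
    A, b = normal_equations(Y, r)
    return solver(A, b)
```

A constrained solver would be passed as `solver=...` with the same signature, so the code that builds Y^T Y and the right-hand side never needs to know which solver is in use.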

Re: Constraint Solver for Spark

2014-06-05 Thread Debasish Das
Hi Xiangrui, For orthogonality properties in the factors we need a constraint solver other than the usual ones (l1, upper and lower bounds, l2, etc.). The interface of the constraint solver is standard and I can add it to MLlib optimization. But I am not sure how I will call the GPL-licensed IPM solver

Re: Constraint Solver for Spark

2014-06-05 Thread Xiangrui Meng
Hi Deb, Why do you want to make those methods public? If you only need to replace the solver for subproblems, you can try to make the solver pluggable. It currently supports least squares and non-negative least squares. You can define an interface for the subproblem solvers and maintain the IPM solver a

Re: Contributing Spark Infrastructure Configuration Docs

2014-06-05 Thread Andrew Ash
I would appreciate seeing the specs you came up with as well, but don't need them particularly quickly. I'll wait until I see the PR to comment on the specifics, but I have some questions about the thought process that went into configuring the hardware. Is the idea to see how you spec'd out memory/d

Re: Contributing Spark Infrastructure Configuration Docs

2014-06-05 Thread Krishna Sankar
Stephen, We are working through Dell configurations; would be happy to review your diagrams and offer feedback from our experience. Let me know the URLs. Cheers On Thu, Jun 5, 2014 at 2:51 PM, Stephen Watt wrote: > Hi Folks > > My name is Steve Watt and I work in the CTO Office at Red Hat. I'

Constraint Solver for Spark

2014-06-05 Thread Debasish Das
Hi, We are adding a constrained ALS solver in Spark to solve matrix factorization use cases that need additional constraints (bounds, equality, inequality, quadratic constraints). We are using a native version of a primal-dual SOCP solver due to its small memory footprint and sparse ccs matrix c

Cannot use pyspark to aggregate on remote EC2 cluster

2014-06-05 Thread kriskalish
I'm in a situation where I have two compute nodes in Amazon EC2 and a third node that is used to just execute queries. The third node is not part of the cluster. It's also configured slightly differently. That is, the third node runs Ubuntu 14.04 while the two cluster nodes run CentOS. I launch p

Contributing Spark Infrastructure Configuration Docs

2014-06-05 Thread Stephen Watt
Hi Folks, My name is Steve Watt and I work in the CTO Office at Red Hat. I've recently spent quite a bit of time designing single-rack and multi-rack infrastructures for Spark for our own hardware procurement at Red Hat and I thought the diagrams and server specs for both Dell and HP would be us

Cassandra Examples Don't Work

2014-06-05 Thread Tim Kellogg
Hi, I’ve tried running the CassandraTest example against several versions of Cassandra and I can’t get it to work. I’m wondering if I’m doing something wrong, or if they simply don’t work. Please help! http://stackoverflow.com/q/24069039/503826 Much Thanks! Tim Kellogg Sr. Software Engineer,

Re: Building Spark against Scala 2.10.1 virtualized

2014-06-05 Thread Matei Zaharia
You can modify project/SparkBuild.scala and build Spark with sbt instead of Maven. On Jun 5, 2014, at 12:36 PM, Meisam Fathi wrote: > Hi community, > > How should I change sbt to compile spark core with a different version > of Scala? I see maven pom files define dependencies to scala 2.10.4.

Re: Timestamp support in v1.0

2014-06-05 Thread Michael Armbrust
Awesome, thanks for testing! On Thu, Jun 5, 2014 at 1:30 PM, dataginjaninja wrote: > I can confirm that the patch fixed my issue. :-) > > > > - > Cheers, > > Stephanie > -- > View this message in context: > http://apache-spark-developers-list.1001551.n3.nabble.com/Timestamp-support-in-v1-0-

Re: Timestamp support in v1.0

2014-06-05 Thread dataginjaninja
I can confirm that the patch fixed my issue. :-) - Cheers, Stephanie -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Timestamp-support-in-v1-0-tp6850p6948.html Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

Building Spark against Scala 2.10.1 virtualized

2014-06-05 Thread Meisam Fathi
Hi community, How should I change sbt to compile spark core with a different version of Scala? I see maven pom files define dependencies to scala 2.10.4. I need to override/ignore the maven dependencies and use Scala virtualized, which needs these lines in a build.sbt file: scalaOrganization := "

Re: Implementing rdd.scanLeft()

2014-06-05 Thread Andrew Ash
Is that something that documentation on the method can solve? On Thu, Jun 5, 2014 at 10:47 AM, Reynold Xin wrote: > I think the main concern is this would require scanning the data twice, and > maybe the user should be aware of it ... > > > On Thu, Jun 5, 2014 at 10:29 AM, Andrew Ash wrote: > >

Re: Implementing rdd.scanLeft()

2014-06-05 Thread Reynold Xin
I think the main concern is this would require scanning the data twice, and maybe the user should be aware of it ... On Thu, Jun 5, 2014 at 10:29 AM, Andrew Ash wrote: > I have a use case that would greatly benefit from RDDs having a .scanLeft() > method. Are the project developers interested
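
To illustrate why a partitioned scanLeft would read the data twice, here is a minimal sketch in plain Python, with lists of lists standing in for RDD partitions. It is not a Spark API and not the implementation proposed in this thread: the function name `scan_left_partitioned` is hypothetical, and the sketch assumes `op` is associative with `zero` as its identity, which a general scanLeft does not require.

```python
from functools import reduce
from itertools import accumulate
from typing import Callable, List, TypeVar

T = TypeVar("T")


def scan_left_partitioned(
    partitions: List[List[T]], zero: T, op: Callable[[T, T], T]
) -> List[T]:
    """Two-pass scan over partitioned data (assumes op is associative)."""
    # Pass 1: reduce every partition to a single total (first read of the data).
    totals = [reduce(op, part, zero) for part in partitions]

    # Small sequential step: the offset for partition i combines the totals
    # of all partitions before it.
    offsets = list(accumulate(totals, op, initial=zero))[:-1]

    # Pass 2: scan each partition starting from its offset (second read).
    result = [zero]
    for part, offset in zip(partitions, offsets):
        # Drop the first element, which is the offset itself.
        result.extend(list(accumulate(part, op, initial=offset))[1:])
    return result
```

Both passes over the partitions can run in parallel; only the scan over the per-partition totals is sequential, and it touches one value per partition.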

Implementing rdd.scanLeft()

2014-06-05 Thread Andrew Ash
I have a use case that would greatly benefit from RDDs having a .scanLeft() method. Are the project developers interested in adding this to the public API? Looking through past message traffic, this has come up a few times. The recommendation from the list before has been to implement a paralle