On Tue, Mar 3, 2015 at 11:15 PM, Krishna Sankar wrote:
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 13:53 min
mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11
2. Tested pyspark, MLlib - running them as well as comparing results with
1.1.x & 1.2.x
2.1. statisti
Can someone show me a code snippet on how I can create one SparkContext and
share it across multiple Unit Test files? I want the tests to run in
parallel as well. (i.e. parallelExecution in Test := true)
I looked up SharedSparkContext; it doesn't seem to work when tests are run in
parallel. Can someo
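Not from the thread, but a minimal sketch of the usual pattern for this: keep one lazily-initialized context in a shared holder so every test class, even when suites run in parallel, reuses the same instance. The class and field names here are hypothetical, and a plain Object stands in for SparkContext so the sketch is self-contained:

```java
// Hypothetical sketch: one lazily-created shared resource (a stand-in for
// SparkContext) that parallel test classes can all reuse.
public class SharedContext {
    // volatile + double-checked locking: exactly one instance is created
    // even when several test suites race to call get().
    private static volatile Object context;

    public static Object get() {
        if (context == null) {
            synchronized (SharedContext.class) {
                if (context == null) {
                    // In real tests this would be: new SparkContext(conf)
                    context = new Object();
                }
            }
        }
        return context;
    }
}
```

Each test file would then call `SharedContext.get()` instead of constructing its own context; a shutdown hook (or a test-framework afterAll in the last suite) can stop it once.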
Please vote on releasing the following candidate as Apache Spark version 1.3.0!
The tag to be voted on is v1.3.0-rc2 (commit 3af2687):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=3af26870e5163438868c4eb2df88380a533bb232
The release files, including signatures, digests, etc. can
This vote is cancelled in favor of RC2.
On Thu, Feb 26, 2015 at 9:50 AM, Sandor Van Wassenhove
wrote:
> FWIW, I tested the first rc and saw no regressions. I ran our benchmarks
> built against spark 1.3 and saw results consistent with spark 1.2/1.2.1.
>
> On 2/25/15, 5:51 PM, "Patrick Wendell" w
Hi Robert,
There's some work on doing LDA via Gibbs sampling in this JIRA:
https://issues.apache.org/jira/browse/SPARK-1405 as well as this one:
https://issues.apache.org/jira/browse/SPARK-5556
It may make sense to have a more general Gibbs sampling framework, but it
might be good to have a few desi
Hi,
I have some ideas for MLlib that I think might be of general interest
so I'd like to see what people think and maybe find some collaborators.
(1) Some form of Markov chain Monte Carlo such as Gibbs sampling
or Metropolis-Hastings. Any kind of Monte Carlo method is readily
parallelized so Spar
BTW, is anybody on this list going to the London Meetup in a few weeks?
https://skillsmatter.com/meetups/6987-apache-spark-living-the-post-mapreduce-world#community
Would be nice to meet other people working on the guts of Spark! :-)
Xiangrui Meng writes:
> Hey Alexander,
>
> I don't quite un
Hi,
I want to start a Spark standalone cluster programmatically in Java.
I have been checking these classes,
- org.apache.spark.deploy.master.Master
- org.apache.spark.deploy.worker.Worker
I successfully started a master with this simple main class.
public static void main(String[] args) {