Re: [VOTE] Release Apache Spark 1.6.0 (RC1)

2015-12-03 Thread robineast
+1
OSX 10.10.5, java version "1.8.0_40", scala 2.10
mvn clean package -DskipTests
[INFO] Spark Project External Kafka ... SUCCESS [ 18.161 s]
[INFO] Spark Project Examples . SUCCESS [01:18 min]
[INFO] Spark Project External Kafka Assembly .

Re: [VOTE] Release Apache Spark 1.5.1 (RC1)

2015-09-26 Thread robineast
+1
build/mvn clean package -DskipTests -Pyarn -Phadoop-2.6 OK
Basic graph tests
  Load graph using edgeListFile...SUCCESS
  Run PageRank...SUCCESS
Minimum Spanning Tree Algorithm
  Run basic Minimum Spanning Tree algorithm...SUCCESS
  Run Minimum Spanning Tree taxonomy creation...SUCCESS
-- Vi

Re: RDD API patterns

2015-09-16 Thread robineast
I'm not sure the problem is quite as bad as you state. Both sampleByKey and sampleByKeyExact are implemented using a function from StratifiedSamplingUtils which does one of two things depending on whether the exact implementation is needed. The exact version requires double the number of lines of c

Re: [HELP] Spark 1.4.1 tasks take ridiculously long time to complete

2015-09-03 Thread robineast
I would suggest you move this to the Spark User list; this is the development list, which is for discussion of the development of Spark. It would help if you could give some more information about what you are trying to do, e.g. what code you are running, how you submitted the job (spark-shell, spark-submit) and