Re: [VOTE] Release Apache Spark 2.0.1 (RC3)

2016-09-26 Thread Krishna Sankar
I do run both Python and Scala. But via iPython/Python2 with my own test code. Not running the tests from the distribution. Cheers On Mon, Sep 26, 2016 at 11:59 AM, Holden Karau wrote: > I'm seeing some test failures with Python 3 that could definitely be > environmental (going to rebuild my vi

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-20 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OS X 10.11.5 (El Capitan) OK Total time: 24:07 min mvn clean package -Pyarn -Phadoop-2.7 -DskipTests 2. Tested pyspark, mllib (iPython 4.0) 2.0 Spark version is 2.0.0 2.1. statistics (min,max,mean,Pearson,Spearman) OK 2.2. Linear/Ridge/Lasso Regression

Re: [VOTE] Release Apache Spark 2.0.0 (RC4)

2016-07-15 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OS X 10.11.5 (El Capitan) OK Total time: 26:27 min mvn clean package -Pyarn -Phadoop-2.7 -DskipTests 2. Tested pyspark, mllib (iPython 4.0) 2.0 Spark version is 2.0.0 2.1. statistics (min,max,mean,Pearson,Spearman) OK 2.2. Linear/Ridge/Lasso Regression

Re: [VOTE] Release Apache Spark 2.0.0 (RC4)

2016-07-15 Thread Krishna Sankar
Can't find the "spark-assembly-2.0.0-hadoop2.7.0.jar" after compilation. Usually it is in the assembly/target/scala-2.11 Has the packaging changed for 2.0.0 ? Cheers On Thu, Jul 14, 2016 at 11:59 AM, Reynold Xin wrote: > Please vote on releasing the following candidate as Apache Spark version >

Re: [VOTE] Release Apache Spark 1.6.2 (RC2)

2016-06-22 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 37:11 min mvn clean package -Pyarn -Phadoop-2.6 -DskipTests 2. Tested pyspark, mllib (iPython 4.0) 2.0 Spark version is 1.6.2 2.1. statistics (min,max,mean,Pearson,Spearman) OK 2.2. Linear/Ridge/Lasso Regression OK 2.

Thanks For a Job Well Done !!!

2016-06-18 Thread Krishna Sankar
Hi all, Just wanted to thank all for the dataset API - most of the times we see only bugs in these lists ;o). - Putting some context, this weekend I was updating the SQL chapters of my book - it had all the ugliness of SchemaRDD, registerTempTable, take(10).foreach(println) and take

Re: [vote] Apache Spark 2.0.0-preview release (rc1)

2016-05-18 Thread Krishna Sankar
+1. Looks Good. The mllib results are in line with 1.6.1. Deprecation messages. I will convert to ml and test later in the day. Also will try GraphX exercises for our Strata London Tutorial Quick Notes: 1. pyspark env variables need to be changed - IPYTHON and IPYTHON_OPTS are removed

Re: [GRAPHX] Graph Algorithms and Spark

2016-04-21 Thread Krishna Sankar
Hi, 1. Yep, GraphX is stable and would be a good choice for you to implement algorithms. For a quick intro you can refer to our Strata MLlib tutorial GraphX slides http://goo.gl/Ffq2Az 2. GraphX has implemented algorithms like PageRank & ConnectedComponents[1] 3. It also has prim

Re: [VOTE] Release Apache Spark 1.6.0 (RC4)

2015-12-25 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 29:25 min mvn clean package -Pyarn -Phadoop-2.6 -DskipTests 2. Tested pyspark, mllib (iPython 4.0) 2.0 Spark version is 1.6.0 2.1. statistics (min,max,mean,Pearson,Spearman) OK 2.2. Linear/Ridge/Lasso Regression OK 2.3

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-17 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 29:32 min mvn clean package -Pyarn -Phadoop-2.6 -DskipTests 2. Tested pyspark, mllib (iPython 4.0) 2.0 Spark version is 1.6.0 2.1. statistics (min,max,mean,Pearson,Spearman) OK 2.2. Linear/Ridge/Lasso Regression OK 2.3

Re: [VOTE] Release Apache Spark 1.6.0 (RC2)

2015-12-14 Thread Krishna Sankar
Guys, The sc.version gives 1.6.0-SNAPSHOT. Need to change to 1.6.0. Can you please verify ? Cheers On Sat, Dec 12, 2015 at 9:39 AM, Michael Armbrust wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.6.0! > > The vote is open until Tuesday, December 15, 2015 at
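A check like the one described in this message can be sketched in a few lines of Python. This is only an illustration: the helper name is_final_release is hypothetical and not part of Spark, and the version strings are hard-coded here, whereas in a live shell the value would come from sc.version.

```python
import re

# Hypothetical helper (not a Spark API): a final release is assumed to be a
# plain MAJOR.MINOR.PATCH string with no suffix such as "-SNAPSHOT".
RELEASE_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def is_final_release(version: str) -> bool:
    """Return True only when the whole string is MAJOR.MINOR.PATCH."""
    return RELEASE_PATTERN.fullmatch(version) is not None


# Hard-coded stand-ins for the value a shell would report via sc.version.
for reported in ["1.6.0-SNAPSHOT", "1.6.0"]:
    status = "final release" if is_final_release(reported) else "not a final release"
    print(f"{reported}: {status}")
```

Using fullmatch rather than match keeps a trailing suffix from slipping through, which is the case the message reports.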

Re: [VOTE] Release Apache Spark 1.5.2 (RC2)

2015-11-08 Thread Krishna Sankar
In addition to the wrong entry point, I suspect there is a cache problem as well. I have seen strange errors that disappear completely once the ivy cache is deleted. Cheers On Sun, Nov 8, 2015 at 7:54 PM, Ted Yu wrote: > Why did you directly jump to spark-streaming-mqtt module ? > > Can you dro

Re: [VOTE] Release Apache Spark 1.5.2 (RC2)

2015-11-06 Thread Krishna Sankar
+1 (non-binding, of course) (Hope I made it in time. ~T-20 !) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 25:52 min mvn clean package -Pyarn -Phadoop-2.6 -DskipTests 2. Tested pyspark, mllib (iPython 4.0, FYI, notebook install is separate “conda install ipython” and then “conda install ju

Re: [VOTE] Release Apache Spark 1.5.2 (RC1)

2015-10-26 Thread Krishna Sankar
Guys, The sc.version returns 1.5.1 in python and scala. Is anyone getting the same results ? Probably I am doing something wrong. Cheers On Sun, Oct 25, 2015 at 12:07 AM, Reynold Xin wrote: > Please vote on releasing the following candidate as Apache Spark > version 1.5.2. The vote is open u

Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-12 Thread Krishna Sankar
I think the key is to vote a specific set of source tarballs without any binary artifacts. The specific binaries are useful but shouldn't be part of the voting process. Makes sense, we really cannot prove (and no need to) that the binaries do not contain malware, but the source can be proven to be

Re: [VOTE] Release Apache Spark 1.5.1 (RC1)

2015-09-24 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 26:48 min mvn clean package -Pyarn -Phadoop-2.6 -DskipTests 2. Tested pyspark, mllib (iPython 4.0, FYI, notebook install is separate “conda install python” and then “conda install jupyter”) 2.1. statistics (min,max,me

Re: [VOTE] Release Apache Spark 1.5.0 (RC3)

2015-09-04 Thread Krishna Sankar
ate the notebook to use builtin SQL function month and year, > instead of Python UDF? (they are introduced in 1.5). > > Once remove those two udfs, it runs successfully, also much faster. > > On Fri, Sep 4, 2015 at 2:22 PM, Krishna Sankar > wrote: > > Yin, > >It

Re: [VOTE] Release Apache Spark 1.5.0 (RC3)

2015-09-04 Thread Krishna Sankar
7:30 AM, Tom Graves wrote: >> >>> The upper/lower case thing is known. >>> https://issues.apache.org/jira/browse/SPARK-9550 >>> I assume it was decided to be ok and its going to be in the release >>> notes but Reynold or Josh can probably speak to it

Re: [VOTE] Release Apache Spark 1.5.0 (RC3)

2015-09-04 Thread Krishna Sankar
I assume it was decided to be ok and its going to be in the release notes > but Reynold or Josh can probably speak to it more. > > Tom > > > > On Thursday, September 3, 2015 10:21 PM, Krishna Sankar < > ksanka...@gmail.com> wrote: > > > +? > > 1.

Re: [VOTE] Release Apache Spark 1.5.0 (RC3)

2015-09-03 Thread Krishna Sankar
+? 1. Compiled OSX 10.10 (Yosemite) OK Total time: 26:09 min mvn clean package -Pyarn -Phadoop-2.6 -DskipTests 2. Tested pyspark, mllib 2.1. statistics (min,max,mean,Pearson,Spearman) OK 2.2. Linear/Ridge/Lasso Regression OK 2.3. Decision Tree, Naive Bayes OK 2.4. KMeans OK Center And S

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-27 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 42:36 min mvn clean package -Pyarn -Phadoop-2.6 -DskipTests 2. Tested pyspark, mllib 2.1. statistics (min,max,mean,Pearson,Spearman) OK 2.2. Linear/Ridge/Lasso Regression OK 2.3. Decision Tree, Naive Bayes OK 2.4. KMea

Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-09 Thread Krishna Sankar
+1 1. Compiled OSX 10.10 (Yosemite) OK Total time: 38:11 min mvn clean package -Pyarn -Phadoop-2.6 -DskipTests 2. Tested pyspark, mllib 2.1. statistics (min,max,mean,Pearson,Spearman) OK 2.2. Linear/Ridge/Lasso Regression OK 2.3. Decision Tree, Naive Bayes OK 2.4. KMeans OK Center And S

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-07 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 27:24 min mvn clean package -Pyarn -Phadoop-2.6 -DskipTests 2. Tested pyspark, mllib 2.1. statistics (min,max,mean,Pearson,Spearman) OK 2.2. Linear/Ridge/Lasso Regression OK 2.3. Decision Tree, Naive Bayes OK 2.4. KMea

Re: Can not build master

2015-07-03 Thread Krishna Sankar
Patrick, I assume an RC3 will be out for folks like me to test the distribution. As usual, I will run the tests when you have a new distribution. Cheers On Fri, Jul 3, 2015 at 4:38 PM, Patrick Wendell wrote: > Patch that added test-jar dependencies: > https://github.com/apache/spark/commit/b

Re: [VOTE] Release Apache Spark 1.4.1 (RC2)

2015-07-03 Thread Krishna Sankar
e built-in maven (i.e. build/mvn). It might be that > we require a newer version of maven than you have. The release itself > is built with maven 3.3.3: > > https://github.com/apache/spark/blob/master/build/mvn#L72 > > - Patrick > > On Fri, Jul 3, 2015 at 3:19 PM, K

Re: [VOTE] Release Apache Spark 1.4.1 (RC2)

2015-07-03 Thread Krishna Sankar
Yep, happens to me as well. Build loops. Cheers On Fri, Jul 3, 2015 at 2:40 PM, Ted Yu wrote: > Patrick: > I used the following command: > ~/apache-maven-3.3.1/bin/mvn -DskipTests -Phadoop-2.4 -Pyarn -Phive clean > package > > The build doesn't seem to stop. > Here is tail of build output: > >

Re: except vs subtract

2015-07-03 Thread Krishna Sankar
Thanks. Forgot about that ;o( On Thu, Jul 2, 2015 at 11:57 PM, Reynold Xin wrote: > "except" is a keyword in Python unfortunately. > > > > On Thu, Jul 2, 2015 at 11:54 PM, Krishna Sankar > wrote: > >> Guys, >>Scala says except while python has s

except vs subtract

2015-07-02 Thread Krishna Sankar
Guys, Scala says except while python has subtract. (I verified that except doesn't exist in python) Why the difference in syntax for the same functionality ? Cheers
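The reply in the thread listed above points out that "except" is a keyword in Python, which is why it cannot be used as a method name there. A minimal sketch of how to confirm that with the standard library follows; df1 and df2 are placeholder names, the expressions are only parsed and never evaluated, and no Spark installation is assumed.

```python
import keyword


def parses(expression: str) -> bool:
    """Return True if the text is syntactically valid as a Python expression."""
    try:
        # compile() only parses; the placeholder names are never looked up.
        compile(expression, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# "except" is reserved by the language; "subtract" is an ordinary identifier.
print("except is a keyword:", keyword.iskeyword("except"))
print("subtract is a keyword:", keyword.iskeyword("subtract"))

# A method call spelled with the reserved word is rejected by the parser.
print("df1.except(df2) parses:", parses("df1.except(df2)"))
print("df1.subtract(df2) parses:", parses("df1.subtract(df2)"))
```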

Re: [VOTE] Release Apache Spark 1.4.1

2015-06-29 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 13:26 min mvn clean package -Pyarn -Phadoop-2.6 -DskipTests 2. Tested pyspark, mllib 2.1. statistics (min,max,mean,Pearson,Spearman) OK 2.2. Linear/Ridge/Lasso Regression OK 2.3. Decision Tree, Naive Bayes OK 2.4. KMea

Re: [VOTE] Release Apache Spark 1.4.1

2015-06-28 Thread Krishna Sankar
Patrick, Haven't seen any replies on test results. I will byte ;o) - Should I test this version or is another one in the wings ? Cheers On Tue, Jun 23, 2015 at 10:37 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.4.1! > > This releas

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-05 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 25:42 min (My brand new shiny MacBookPro12,1 : 16GB. Inaugurated the machine with compile & test 1.4.0-RC4 !) mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -DskipTests 2. Tested pys

Re: [VOTE] Release Apache Spark 1.4.0 (RC3)

2015-05-30 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 17:07 min mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -DskipTests 2. Tested pyspark, mllib - running as well as compare results with 1.3.1 2.1. statistics (min,max,mean,Pearson,Spe

Re: [VOTE] Release Apache Spark 1.4.0 (RC2)

2015-05-24 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 16:52 min mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -DskipTests 2. Tested pyspark, mllib - running as well as compare results with 1.3.1 2.1. statistics (min,max,mean,Pearson,Spe

Re: [VOTE] Release Apache Spark 1.4.0 (RC1)

2015-05-19 Thread Krishna Sankar
Quick tests from my side - looks OK. The results are same or very similar to 1.3.1. Will add dataframes et al in future tests. +1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 17:42 min mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6

Re: [VOTE] Release Apache Spark 1.3.1 (RC3)

2015-04-11 Thread Krishna Sankar
+1. All tests OK (same as RC2) Cheers On Fri, Apr 10, 2015 at 11:05 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.3.1! > > The tag to be voted on is v1.3.1-rc2 (commit 3e83913): > > https://git-wip-us.apache.org/repos/asf?p=spark.git;a

Re: [VOTE] Release Apache Spark 1.3.1 (RC2)

2015-04-08 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 14:16 min mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11 2. Tested pyspark, mllib - running as well as compare results with 1.3.0 pyspark works well

Re: [VOTE] Release Apache Spark 1.2.2

2015-04-06 Thread Krishna Sankar
+1 On Sun, Apr 5, 2015 at 4:24 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.2.2! > > The tag to be voted on is v1.2.2-rc1 (commit 7531b50): > > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=7531b50e406ee2e3301b009ceea7

Re: [VOTE] Release Apache Spark 1.3.1

2015-04-04 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 15:04 min mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11 2. Tested pyspark, mllib - running as well as compare results with 1.3.0 pyspark works well

Re: [VOTE] Release Apache Spark 1.3.0 (RC3)

2015-03-09 Thread Krishna Sankar
Excellent, Thanks Xiangrui. The mystery is solved. Cheers On Mon, Mar 9, 2015 at 3:30 PM, Xiangrui Meng wrote: > Krishna, I tested your linear regression example. For linear > regression, we changed its objective function from 1/n * \|A x - > b\|_2^2 to 1/(2n) * \|Ax - b\|_2^2 to be consistent
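The change of objective quoted in this message, from 1/n * \|Ax - b\|_2^2 to 1/(2n) * \|Ax - b\|_2^2, can be illustrated with a scalar example. The snippet is cut off, so what follows is only a sketch of one consequence of such a rescaling, not a statement of what the thread concluded. It assumes a one-dimensional model and a penalty of the form lam * x^2, both chosen here for illustration; it is not MLlib's implementation. Under those assumptions, halving the loss term while keeping lam fixed gives the same minimizer as the old loss with lam doubled, and with lam = 0 the minimizer does not change at all.

```python
def ridge_1d(a, b, loss_scale, lam):
    """Minimize loss_scale * (1/n) * sum((a_i*x - b_i)^2) + lam * x^2 over scalar x.

    Setting the derivative to zero gives the closed form
        x = (loss_scale * S_ab / n) / (loss_scale * S_aa / n + lam),
    where S_ab = sum(a_i*b_i) and S_aa = sum(a_i^2).
    """
    n = len(a)
    s_ab = sum(ai * bi for ai, bi in zip(a, b))
    s_aa = sum(ai * ai for ai in a)
    return (loss_scale * s_ab / n) / (loss_scale * s_aa / n + lam)


# Made-up illustrative data, not taken from the thread.
a = [1.0, 2.0, 3.0]
b = [2.0, 4.1, 5.9]
lam = 0.1

x_old = ridge_1d(a, b, loss_scale=1.0, lam=lam)            # 1/n scaling
x_new = ridge_1d(a, b, loss_scale=0.5, lam=lam)            # 1/(2n) scaling, same lam
x_old_doubled = ridge_1d(a, b, loss_scale=1.0, lam=2 * lam)

print("1/n    objective, lam   :", x_old)
print("1/(2n) objective, lam   :", x_new)
print("1/n    objective, 2*lam :", x_old_doubled)
```

The second and third printed values agree, which is the sense in which the same regularization parameter can give different coefficients across the two scalings.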

Re: [VOTE] Release Apache Spark 1.3.0 (RC3)

2015-03-08 Thread Krishna Sankar
Yep, otherwise this will become an N^2 problem - Scala versions X Hadoop Distributions X ... Maybe one option is to have a minimum basic set (which I know is what we are discussing) and move the rest to spark-packages.org. There the vendors can add the latest downloads - for example when 1.4 is r

Re: [VOTE] Release Apache Spark 1.3.0 (RC3)

2015-03-06 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 13:55 min mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11 2. Tested pyspark, mllib - running as well as compare results with 1.1.x & 1.2.x pyspark wo

Re: [VOTE] Release Apache Spark 1.3.0 (RC2)

2015-03-04 Thread Krishna Sankar
015 at 11:15 PM, Krishna Sankar > wrote: > > +1 (non-binding, of course) > > > > 1. Compiled OSX 10.10 (Yosemite) OK Total time: 13:53 min > > mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 > > -Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11 &

Re: [VOTE] Release Apache Spark 1.3.0 (RC2)

2015-03-03 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 13:53 min mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11 2. Tested pyspark, mllib - running as well as compare results with 1.1.x & 1.2.x 2.1. statisti

Re: [VOTE] Release Apache Spark 1.3.0 (RC1)

2015-02-19 Thread Krishna Sankar
Excellent. Explicit toDF() works. a) employees.toDF().registerTempTable("Employees") - works b) Also affects saveAsParquetFile - orders.toDF().saveAsParquetFile Adding to my earlier tests: 4.0 SQL from Scala and Python 4.1 result = sqlContext.sql("SELECT * from Employees WHERE State = 'WA'") OK 4.

Re: [VOTE] Release Apache Spark 1.3.0 (RC1)

2015-02-18 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 14:50 min mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11 2. Tested pyspark, mllib - running as well as compare results with 1.1.x & 1.2.x 2.1. statisti

Re: [VOTE] Release Apache Spark 1.2.1 (RC3)

2015-02-02 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 11:13 min mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11 2. Tested pyspark, mllib - running as well as compare results with 1.1.x & 1.2.0 2.1. statisti

Re: [VOTE] Release Apache Spark 1.2.1 (RC2)

2015-01-28 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 12:22 min mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -DskipTests 2. Tested pyspark, mllib - running as well as compare results with 1.1.x & 1.2.0 2.1. statistics (min,max,m

Re: [VOTE] Release Apache Spark 1.2.1 (RC1)

2015-01-27 Thread Krishna Sankar
+1 1. Compiled OSX 10.10 (Yosemite) OK Total time: 12:55 min mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -DskipTests 2. Tested pyspark, mllib - running as well as compare results with 1.1.x & 1.2.0 2.1. statistics OK 2.2. Linear/Ridge/Lasso Regression

Fwd: [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-17 Thread Krishna Sankar
Forgot Reply To All ;o( -- Forwarded message -- From: Krishna Sankar Date: Wed, Dec 10, 2014 at 9:16 PM Subject: Re: [VOTE] Release Apache Spark 1.2.0 (RC2) To: Matei Zaharia +1 Works same as RC1 1. Compiled OSX 10.10 (Yosemite) mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0

Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-04 Thread Krishna Sankar
On Sun, Nov 30, 2014 at 6:49 AM, Krishna Sankar > wrote: > > +1 > > 1. Compiled OSX 10.10 (Yosemite) mvn -Pyarn -Phadoop-2.4 > > -Dhadoop.version=2.4.0 -DskipTests clean package 16:46 min (slightly > slower > > connection) > > 2. Tested pyspark, mllib - running

Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-11-29 Thread Krishna Sankar
+1 1. Compiled OSX 10.10 (Yosemite) mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package 16:46 min (slightly slower connection) 2. Tested pyspark, mllib - running as well as compare results with 1.1.x 2.1. statistics OK 2.2. Linear/Ridge/Lasso Regression OK Slight difference

Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-11-28 Thread Krishna Sankar
Looks like the documentation hasn't caught up with the new features. On the machine learning side, for example org.apache.spark.ml, RandomForest, gbtree and so forth. Is a refresh of the documentation planned ? Am happy to see these capabilities, but these would need good explanations as well, espe

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-19 Thread Krishna Sankar
+1 1. Compiled OSX 10.10 (Yosemite) mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package 10:49 min 2. Tested pyspark, mllib 2.1. statistics OK 2.2. Linear/Ridge/Lasso Regression OK 2.3. Decision Tree, Naive Bayes OK 2.4. KMeans OK 2.5. rdd operations OK 2.6. recommendation OK 2.7.

Re: [VOTE] Release Apache Spark 1.1.1 (RC1)

2014-11-13 Thread Krishna Sankar
+1 1. Compiled OSX 10.10 (Yosemite) mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package 10:49 min 2. Tested pyspark, mllib 2.1. statistics OK 2.2. Linear/Ridge/Lasso Regression OK 2.3. Decision Tree, Naive Bayes OK 2.4. KMeans OK 2.5. rdd operations OK 2.6. recommendation OK 2.7.

Re: Breaking the previous large-scale sort record with Spark

2014-10-13 Thread Krishna Sankar
Well done guys. MapReduce sort at that time was a good feat and Spark now has raised the bar with the ability to sort a PB. Like some of the folks in the list, a summary of what worked (and didn't) as well as the monitoring practices would be good. Cheers P.S: What are you folks planning next ? O

Re: [VOTE] Release Apache Spark 1.0.1 (RC2)

2014-07-05 Thread Krishna Sankar
+1 - Compiled rc2 w/ CentOS 6.5, Yarn,Hadoop 2.2.0 - successful - Smoke Test (scala,python) (distributed cluster) - successful - We had run Java/SparkSQL (count, distinct et al) ~250M records RDD over HBase 0.98.3 over last build (rc1) - successful - Stand alone multi-node cluster i

Re: [VOTE] Release Apache Spark 1.0.1 (RC1)

2014-06-27 Thread Krishna Sankar
+1 Compiled for CentOS 6.5, deployed in our 4 node cluster (Hadoop 2.2, YARN) Smoke Tests (sparkPi,spark-shell, web UI) successful Cheers On Thu, Jun 26, 2014 at 7:06 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.0.1! > > The tag to

Re: Contributing Spark Infrastructure Configuration Docs

2014-06-05 Thread Krishna Sankar
Stephen, We are working thru Dell configurations; would be happy to review your diagrams and offer feedback from our experience. Let me know the URLs. Cheers On Thu, Jun 5, 2014 at 2:51 PM, Stephen Watt wrote: > Hi Folks > > My name is Steve Watt and I work in the CTO Office at Red Hat. I'

Re: [VOTE] Release Apache Spark 1.0.0 (RC11)

2014-05-28 Thread Krishna Sankar
+1 Pulled & built on MacOS X, EC2 Amazon Linux Ran test programs on OS X, 5 node c3.4xlarge cluster Cheers On Wed, May 28, 2014 at 7:36 PM, Andy Konwinski wrote: > +1 > On May 28, 2014 7:05 PM, "Xiangrui Meng" wrote: > > > +1 > > > > Tested apps with standalone client mode and yarn cluster and