I do run both Python and Scala, but via iPython/Python 2 with my own test
code. I am not running the tests from the distribution.
Cheers
On Mon, Sep 26, 2016 at 11:59 AM, Holden Karau wrote:
> I'm seeing some test failures with Python 3 that could definitely be
> environmental (going to rebuild my vi
+1 (non-binding, of course)
1. Compiled OS X 10.11.5 (El Capitan) OK Total time: 24:07 min
mvn clean package -Pyarn -Phadoop-2.7 -DskipTests
2. Tested pyspark, mllib (iPython 4.0)
2.0 Spark version is 2.0.0
2.1. statistics (min,max,mean,Pearson,Spearman) OK
2.2. Linear/Ridge/Lasso Regression
+1 (non-binding, of course)
1. Compiled OS X 10.11.5 (El Capitan) OK Total time: 26:27 min
mvn clean package -Pyarn -Phadoop-2.7 -DskipTests
2. Tested pyspark, mllib (iPython 4.0)
2.0 Spark version is 2.0.0
2.1. statistics (min,max,mean,Pearson,Spearman) OK
2.2. Linear/Ridge/Lasso Regression
Can't find the "spark-assembly-2.0.0-hadoop2.7.0.jar" after compilation.
Usually it is in the assembly/target/scala-2.11
Has the packaging changed for 2.0.0 ?
Cheers
On Thu, Jul 14, 2016 at 11:59 AM, Reynold Xin wrote:
> Please vote on releasing the following candidate as Apache Spark version
>
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 37:11 min
mvn clean package -Pyarn -Phadoop-2.6 -DskipTests
2. Tested pyspark, mllib (iPython 4.0)
2.0 Spark version is 1.6.2
2.1. statistics (min,max,mean,Pearson,Spearman) OK
2.2. Linear/Ridge/Lasso Regression OK
2.
Hi all,
Just wanted to thank all for the dataset API - most of the time we see
only bugs in these lists ;o).
- Putting some context, this weekend I was updating the SQL chapters of
my book - it had all the ugliness of SchemaRDD,
registerTempTable, take(10).foreach(println)
and take
+1. Looks Good.
The mllib results are in line with 1.6.1. Deprecation messages. I will
convert to ml and test later in the day.
Also will try GraphX exercises for our Strata London Tutorial
Quick Notes:
1. pyspark env variables need to be changed
- IPYTHON and IPYTHON_OPTS are removed
Hi,
1. Yep, GraphX is stable and would be a good choice for you to implement
algorithms. For a quick intro you can refer to our Strata MLlib tutorial
GraphX slides http://goo.gl/Ffq2Az
2. GraphX has implemented algorithms like PageRank &
ConnectedComponents[1]
3. It also has prim
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 29:25 min
mvn clean package -Pyarn -Phadoop-2.6 -DskipTests
2. Tested pyspark, mllib (iPython 4.0)
2.0 Spark version is 1.6.0
2.1. statistics (min,max,mean,Pearson,Spearman) OK
2.2. Linear/Ridge/Lasso Regression OK
2.3
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 29:32 min
mvn clean package -Pyarn -Phadoop-2.6 -DskipTests
2. Tested pyspark, mllib (iPython 4.0)
2.0 Spark version is 1.6.0
2.1. statistics (min,max,mean,Pearson,Spearman) OK
2.2. Linear/Ridge/Lasso Regression OK
2.3
Guys,
The sc.version gives 1.6.0-SNAPSHOT. Need to change to 1.6.0. Can you please
verify?
Cheers
On Sat, Dec 12, 2015 at 9:39 AM, Michael Armbrust
wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 1.6.0!
>
> The vote is open until Tuesday, December 15, 2015 at
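The sc.version observation in the message above (a release candidate still reporting 1.6.0-SNAPSHOT) can be turned into a small check. This is a minimal sketch in plain Python, assuming the version string has already been read from sc.version; the helper name describe_version is hypothetical and not part of Spark.

```python
def describe_version(reported: str, expected: str) -> str:
    """Compare a reported version string with the release being voted on.

    Hypothetical helper for illustration only; `reported` would typically
    be the value of sc.version in a running shell.
    """
    reported = reported.strip()
    if reported == expected:
        return "OK"
    # A qualifier such as "-SNAPSHOT" appended to the expected version
    prefix = expected + "-"
    if reported.startswith(prefix):
        return "unexpected qualifier: " + reported[len(prefix):]
    return "mismatch: got " + reported + ", expected " + expected


if __name__ == "__main__":
    # Values taken from the messages in this list
    print(describe_version("1.6.0-SNAPSHOT", "1.6.0"))
    print(describe_version("1.5.1", "1.5.2"))
```

Keeping the qualifier case separate from a plain mismatch distinguishes "the build was not re-versioned" from "the wrong build is on the path".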
In addition to the wrong entry point, I suspect there is a cache problem as
well. I have seen strange errors that disappear completely once the ivy
cache is deleted.
Cheers
On Sun, Nov 8, 2015 at 7:54 PM, Ted Yu wrote:
> Why did you directly jump to spark-streaming-mqtt module ?
>
> Can you dro
+1 (non-binding, of course) (Hope I made it in time. ~T-20 !)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 25:52 min
mvn clean package -Pyarn -Phadoop-2.6 -DskipTests
2. Tested pyspark, mllib (iPython 4.0, FYI, notebook install is separate
“conda install ipython” and then “conda install ju
Guys,
The sc.version returns 1.5.1 in python and scala. Is anyone getting the
same results ? Probably I am doing something wrong.
Cheers
On Sun, Oct 25, 2015 at 12:07 AM, Reynold Xin wrote:
> Please vote on releasing the following candidate as Apache Spark
> version 1.5.2. The vote is open u
I think the key is to vote a specific set of source tarballs without any
binary artifacts. The specific binaries are useful but shouldn't be part of
the voting process. Makes sense, we really cannot prove (and no need to)
that the binaries do not contain malware, but the source can be proven to
be
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 26:48 min
mvn clean package -Pyarn -Phadoop-2.6 -DskipTests
2. Tested pyspark, mllib (iPython 4.0, FYI, notebook install is separate
“conda install python” and then “conda install jupyter”)
2.1. statistics (min,max,me
ate the notebook to use builtin SQL function month and year,
> instead of Python UDF? (they are introduced in 1.5).
>
> Once remove those two udfs, it runs successfully, also much faster.
>
> On Fri, Sep 4, 2015 at 2:22 PM, Krishna Sankar
> wrote:
> > Yin,
> >It
7:30 AM, Tom Graves wrote:
>>
>>> The upper/lower case thing is known.
>>> https://issues.apache.org/jira/browse/SPARK-9550
>>> I assume it was decided to be ok and it's going to be in the release
>>> notes but Reynold or Josh can probably speak to it
I assume it was decided to be ok and it's going to be in the release notes
> but Reynold or Josh can probably speak to it more.
>
> Tom
>
>
>
> On Thursday, September 3, 2015 10:21 PM, Krishna Sankar <
> ksanka...@gmail.com> wrote:
>
>
> +?
>
> 1.
+?
1. Compiled OSX 10.10 (Yosemite) OK Total time: 26:09 min
mvn clean package -Pyarn -Phadoop-2.6 -DskipTests
2. Tested pyspark, mllib
2.1. statistics (min,max,mean,Pearson,Spearman) OK
2.2. Linear/Ridge/Lasso Regression OK
2.3. Decision Tree, Naive Bayes OK
2.4. KMeans OK
Center And S
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 42:36 min
mvn clean package -Pyarn -Phadoop-2.6 -DskipTests
2. Tested pyspark, mllib
2.1. statistics (min,max,mean,Pearson,Spearman) OK
2.2. Linear/Ridge/Lasso Regression OK
2.3. Decision Tree, Naive Bayes OK
2.4. KMea
+1
1. Compiled OSX 10.10 (Yosemite) OK Total time: 38:11 min
mvn clean package -Pyarn -Phadoop-2.6 -DskipTests
2. Tested pyspark, mllib
2.1. statistics (min,max,mean,Pearson,Spearman) OK
2.2. Linear/Ridge/Lasso Regression OK
2.3. Decision Tree, Naive Bayes OK
2.4. KMeans OK
Center And S
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 27:24 min
mvn clean package -Pyarn -Phadoop-2.6 -DskipTests
2. Tested pyspark, mllib
2.1. statistics (min,max,mean,Pearson,Spearman) OK
2.2. Linear/Ridge/Lasso Regression OK
2.3. Decision Tree, Naive Bayes OK
2.4. KMea
Patrick,
I assume an RC3 will be out for folks like me to test the distribution.
As usual, I will run the tests when you have a new distribution.
Cheers
On Fri, Jul 3, 2015 at 4:38 PM, Patrick Wendell wrote:
> Patch that added test-jar dependencies:
> https://github.com/apache/spark/commit/b
e built-in maven (i.e. build/mvn). It might be that
> we require a newer version of maven than you have. The release itself
> is built with maven 3.3.3:
>
> https://github.com/apache/spark/blob/master/build/mvn#L72
>
> - Patrick
>
> On Fri, Jul 3, 2015 at 3:19 PM, K
Yep, happens to me as well. Build loops.
Cheers
On Fri, Jul 3, 2015 at 2:40 PM, Ted Yu wrote:
> Patrick:
> I used the following command:
> ~/apache-maven-3.3.1/bin/mvn -DskipTests -Phadoop-2.4 -Pyarn -Phive clean
> package
>
> The build doesn't seem to stop.
> Here is tail of build output:
>
>
Thanks. Forgot about that ;o(
On Thu, Jul 2, 2015 at 11:57 PM, Reynold Xin wrote:
> "except" is a keyword in Python unfortunately.
>
>
>
> On Thu, Jul 2, 2015 at 11:54 PM, Krishna Sankar
> wrote:
>
>> Guys,
>>Scala says except while python has s
Guys,
Scala says except while python has subtract. (I verified that except
doesn't exist in python) Why the difference in syntax for the same
functionality ?
Cheers
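The naming question above comes down to Python's reserved words: a method called except could not be invoked with ordinary dot syntax. A minimal sketch using only the standard library (no Spark required); the names df and other are placeholders that are parsed but never evaluated, and the helper parses is hypothetical.

```python
import keyword


def parses(source: str) -> bool:
    """Return True if `source` is a syntactically valid Python expression."""
    try:
        # compile() only parses; the names inside are never looked up
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    for name in ("except", "subtract"):
        call = "df." + name + "(other)"
        print(name, "keyword:", keyword.iskeyword(name), "parses:", parses(call))
```

Because the parser rejects the reserved word before any attribute lookup happens, a Python API has to pick a different method name for the same operation.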
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 13:26 min
mvn clean package -Pyarn -Phadoop-2.6 -DskipTests
2. Tested pyspark, mllib
2.1. statistics (min,max,mean,Pearson,Spearman) OK
2.2. Linear/Ridge/Lasso Regression OK
2.3. Decision Tree, Naive Bayes OK
2.4. KMea
Patrick,
Haven't seen any replies on test results. I will byte ;o) - Should I
test this version or is another one in the wings ?
Cheers
On Tue, Jun 23, 2015 at 10:37 PM, Patrick Wendell
wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 1.4.1!
>
> This releas
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 25:42 min (My brand new
shiny MacBookPro12,1 : 16GB. Inaugurated the machine with compile & test
1.4.0-RC4 !)
mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6.0 -DskipTests
2. Tested pys
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 17:07 min
mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6.0 -DskipTests
2. Tested pyspark, mllib - running as well as comparing results with 1.3.1
2.1. statistics (min,max,mean,Pearson,Spe
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 16:52 min
mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6.0 -DskipTests
2. Tested pyspark, mllib - running as well as comparing results with 1.3.1
2.1. statistics (min,max,mean,Pearson,Spe
Quick tests from my side - looks OK. The results are same or very similar
to 1.3.1. Will add dataframes et al in future tests.
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 17:42 min
mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6
+1. All tests OK (same as RC2)
Cheers
On Fri, Apr 10, 2015 at 11:05 PM, Patrick Wendell
wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 1.3.1!
>
> The tag to be voted on is v1.3.1-rc2 (commit 3e83913):
>
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 14:16 min
mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11
2. Tested pyspark, mllib - running as well as comparing results with 1.3.0
pyspark works well
+1
On Sun, Apr 5, 2015 at 4:24 PM, Patrick Wendell wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 1.2.2!
>
> The tag to be voted on is v1.2.2-rc1 (commit 7531b50):
>
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=7531b50e406ee2e3301b009ceea7
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 15:04 min
mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11
2. Tested pyspark, mllib - running as well as comparing results with 1.3.0
pyspark works well
Excellent, Thanks Xiangrui. The mystery is solved.
Cheers
On Mon, Mar 9, 2015 at 3:30 PM, Xiangrui Meng wrote:
> Krishna, I tested your linear regression example. For linear
> regression, we changed its objective function from 1/n * \|A x -
> b\|_2^2 to 1/(2n) * \|Ax - b\|_2^2 to be consistent
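The effect of the objective change quoted above can be sketched with a one-variable least-squares problem. This is an illustration under stated assumptions, not MLlib's implementation: the regularization term lam * x^2 and the function name ridge_1d are hypothetical additions; only the 1/n versus 1/(2n) scaling of the squared error comes from the message.

```python
def ridge_1d(a, b, lam, half):
    """Minimize scale * sum((a_i*x - b_i)^2) + lam * x^2 over a scalar x.

    scale is 1/(2n) when `half` is True and 1/n otherwise.
    Setting the derivative to zero gives the closed form
        x = scale * sum(a_i*b_i) / (scale * sum(a_i^2) + lam)
    """
    n = len(a)
    scale = 1.0 / (2 * n) if half else 1.0 / n
    sab = sum(ai * bi for ai, bi in zip(a, b))
    saa = sum(ai * ai for ai in a)
    return scale * sab / (scale * saa + lam)


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0]
    b = [2.0, 4.1, 5.9]
    # Without regularization the scaling does not move the minimizer
    print(ridge_1d(a, b, 0.0, half=False), ridge_1d(a, b, 0.0, half=True))
    # With regularization, the same lam yields different fits
    print(ridge_1d(a, b, 0.1, half=False), ridge_1d(a, b, 0.1, half=True))
```

Under these assumptions, halving the loss while keeping lam fixed is equivalent to doubling lam in the old objective, so regularized fits with the same parameter value can differ between versions even though unregularized fits agree.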
Yep, otherwise this will become an N^2 problem - Scala versions X Hadoop
Distributions X ...
Maybe one option is to have a minimum basic set (which I know is what we
are discussing) and move the rest to spark-packages.org. There the vendors
can add the latest downloads - for example when 1.4 is r
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 13:55 min
mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11
2. Tested pyspark, mllib - running as well as comparing results with 1.1.x &
1.2.x
pyspark wo
015 at 11:15 PM, Krishna Sankar
> wrote:
> > +1 (non-binding, of course)
> >
> > 1. Compiled OSX 10.10 (Yosemite) OK Total time: 13:53 min
> > mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
> > -Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 13:53 min
mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11
2. Tested pyspark, mllib - running as well as comparing results with 1.1.x &
1.2.x
2.1. statisti
Excellent. Explicit toDF() works.
a) employees.toDF().registerTempTable("Employees") - works
b) Also affects saveAsParquetFile - orders.toDF().saveAsParquetFile
Adding to my earlier tests:
4.0 SQL from Scala and Python
4.1 result = sqlContext.sql("SELECT * from Employees WHERE State = 'WA'") OK
4.
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 14:50 min
mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11
2. Tested pyspark, mllib - running as well as comparing results with 1.1.x &
1.2.x
2.1. statisti
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 11:13 min
mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11
2. Tested pyspark, mllib - running as well as comparing results with 1.1.x &
1.2.0
2.1. statisti
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 12:22 min
mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6.0
-Phive -DskipTests
2. Tested pyspark, mllib - running as well as comparing results with 1.1.x &
1.2.0
2.1. statistics (min,max,m
+1
1. Compiled OSX 10.10 (Yosemite) OK Total time: 12:55 min
mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6.0 -Phive -DskipTests
2. Tested pyspark, mllib - running as well as comparing results with 1.1.x &
1.2.0
2.1. statistics OK
2.2. Linear/Ridge/Lasso Regression
Forgot Reply To All ;o(
-- Forwarded message --
From: Krishna Sankar
Date: Wed, Dec 10, 2014 at 9:16 PM
Subject: Re: [VOTE] Release Apache Spark 1.2.0 (RC2)
To: Matei Zaharia
+1
Works same as RC1
1. Compiled OSX 10.10 (Yosemite) mvn -Pyarn -Phadoop-2.4
-Dhadoop.version=2.4.0
On Sun, Nov 30, 2014 at 6:49 AM, Krishna Sankar
> wrote:
> > +1
> > 1. Compiled OSX 10.10 (Yosemite) mvn -Pyarn -Phadoop-2.4
> > -Dhadoop.version=2.4.0 -DskipTests clean package 16:46 min (slightly
> slower
> > connection)
> > 2. Tested pyspark, mllib - running
+1
1. Compiled OSX 10.10 (Yosemite) mvn -Pyarn -Phadoop-2.4
-Dhadoop.version=2.4.0 -DskipTests clean package 16:46 min (slightly slower
connection)
2. Tested pyspark, mllib - running as well as comparing results with 1.1.x
2.1. statistics OK
2.2. Linear/Ridge/Lasso Regression OK
Slight difference
Looks like the documentation hasn't caught up with the new features.
On the machine learning side, for example org.apache.spark.ml,
RandomForest, gbtree and so forth. Is a refresh of the documentation
planned ?
Am happy to see these capabilities, but these would need good explanations
as well, espe
+1
1. Compiled OSX 10.10 (Yosemite) mvn -Pyarn -Phadoop-2.4
-Dhadoop.version=2.4.0 -DskipTests clean package 10:49 min
2. Tested pyspark, mllib
2.1. statistics OK
2.2. Linear/Ridge/Lasso Regression OK
2.3. Decision Tree, Naive Bayes OK
2.4. KMeans OK
2.5. rdd operations OK
2.6. recommendation OK
2.7.
+1
1. Compiled OSX 10.10 (Yosemite) mvn -Pyarn -Phadoop-2.4
-Dhadoop.version=2.4.0 -DskipTests clean package 10:49 min
2. Tested pyspark, mllib
2.1. statistics OK
2.2. Linear/Ridge/Lasso Regression OK
2.3. Decision Tree, Naive Bayes OK
2.4. KMeans OK
2.5. rdd operations OK
2.6. recommendation OK
2.7.
Well done guys. MapReduce sort at that time was a good feat and Spark now
has raised the bar with the ability to sort a PB.
Like some of the folks on the list, I think a summary of what worked (and
didn't) as well as the monitoring practices would be good.
Cheers
P.S: What are you folks planning next ?
O
+1
- Compiled rc2 w/ CentOS 6.5, Yarn,Hadoop 2.2.0 - successful
- Smoke Test (scala,python) (distributed cluster) - successful
- We ran Java/SparkSQL (count, distinct et al) ~250M records RDD
over HBase 0.98.3 over last build (rc1) - successful
- Stand alone multi-node cluster i
+1
Compiled for CentOS 6.5, deployed in our 4 node cluster (Hadoop 2.2, YARN)
Smoke Tests (sparkPi,spark-shell, web UI) successful
Cheers
On Thu, Jun 26, 2014 at 7:06 PM, Patrick Wendell wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 1.0.1!
>
> The tag to
Stephen,
We are working thru Dell configurations; would be happy to review your
diagrams and offer feedback from our experience. Let me know the URLs.
Cheers
On Thu, Jun 5, 2014 at 2:51 PM, Stephen Watt wrote:
> Hi Folks
>
> My name is Steve Watt and I work in the CTO Office at Red Hat. I'
+1
Pulled & built on MacOS X, EC2 Amazon Linux
Ran test programs on OS X, 5 node c3.4xlarge cluster
Cheers
On Wed, May 28, 2014 at 7:36 PM, Andy Konwinski wrote:
> +1
> On May 28, 2014 7:05 PM, "Xiangrui Meng" wrote:
>
> > +1
> >
> > Tested apps with standalone client mode and yarn cluster and
59 matches