Python 3 support for PySpark has been merged into master

2015-04-16 Thread Josh Rosen
Hi everyone, We just merged Python 3 support for PySpark into Spark's master branch (which will become Spark 1.4.0). This means that PySpark now supports Python 2.6+, PyPy 2.5+, and Python 3.4+. To run with Python 3, download and build Spark from the master branch then configure the PYSPARK_PYTH…
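The version floors quoted above (CPython 2.6+ or 3.4+) can be checked up front in a driver script before any Spark work starts. A minimal sketch, assuming only the bounds stated in the announcement — the guard function itself is hypothetical, not part of PySpark:

```python
import sys

# Version floors quoted in the announcement: CPython 2.6+ or 3.4+.
MIN_PY2 = (2, 6)
MIN_PY3 = (3, 4)

def interpreter_supported(version_info=None):
    """Return True if a (major, minor, ...) tuple meets PySpark's stated floor."""
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return (major, minor) >= MIN_PY2
    if major == 3:
        return (major, minor) >= MIN_PY3
    return False  # neither a Python 2 nor a Python 3 interpreter

if not interpreter_supported():
    sys.exit("PySpark on master (Spark 1.4) requires Python 2.6+ or 3.4+")
```

Which interpreter PySpark actually launches is selected with the PYSPARK_PYTHON environment variable before running bin/pyspark or spark-submit.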

[RESULT] [VOTE] Release Apache Spark 1.2.2

2015-04-16 Thread Patrick Wendell
I'm gonna go ahead and close this now - thanks everyone for voting! This vote passes with 7 +1 votes (6 binding) and no 0 or -1 votes. +1: Mark Hamstra* Reynold Xin Krishna Sankar Sean Owen* Tom Graves* Joseph Bradley* Sean McNamara* 0: -1: Thanks! - Patrick On Thu, Apr 16, 2015 at 3:27 PM, S…

Re: [VOTE] Release Apache Spark 1.2.2

2015-04-16 Thread Sean Owen
No, of course Jenkins runs tests. The way some of the tests work, they need the build artifacts to have been created first. So it runs "mvn ... -DskipTests package" then "mvn ... test" On Thu, Apr 16, 2015 at 11:09 PM, Sree V wrote: > In my effort to vote for this release, I found these along: >…

Re: [VOTE] Release Apache Spark 1.2.2

2015-04-16 Thread Sree V
In my effort to vote for this release, I found these along: This is from jenkins. It uses "-DskipTests". [centos] $ /home/jenkins/tools/hudson.tasks.Maven_MavenInstallation/Maven_3.0.5/bin/mvn -Dhadoop.version=2.0.0-mr1-cdh4.1.2 -Dlabel=centos -DskipTests clean package We build on our locals /…

Gitter chat room for Spark

2015-04-16 Thread Nicholas Chammas
Would we be interested in having a public chat room? Gitter offers them for free for open source projects. It's like web-based IRC. Check out the Docker room for example: https://gitter.im/docker/docker And if people prefer to use actual IRC, Gitter offers a bridge for that…

Re: how long does it take for a full build?

2015-04-16 Thread Sree V
Found it, Ted. Thank you. https://amplab.cs.berkeley.edu/jenkins/job/Spark-1.2-Maven-pre-YARN/hadoop.version=2.0.0-mr1-cdh4.1.2,label=centos/354/consoleFull We locally build with "-DskipTests" and on our jenkins as well. Thanking you. With Regards Sree On Thursday, April 16, 2015 1:04 PM,…

Re: how long does it take for a full build?

2015-04-16 Thread Ted Yu
You can find the command at the beginning of the console output: [centos] $ /home/jenkins/tools/hudson.tasks.Maven_MavenInstallation/Maven_3.0.5/bin/mvn -DHADOOP_PROFILE=hadoop-2.4 -Dlabel=centos -DskipTests -Phadoop-2.4 -Pyarn -Phive clean package On Thu, Apr 16, 2015 at 12:42 PM, Sree V wrot…

Re: how long does it take for a full build?

2015-04-16 Thread Sree V
+ Shane. Hi Shane, Would you address 1, please? Thanking you. With Regards Sree On Thursday, April 16, 2015 12:46 PM, Sree V wrote: 1. 40 min+ to 1hr+, from jenkins. I didn't find the commands of the job. Does it require a login? Part of the console output: > git checkout -f 3ae…

Re: how long does it take for a full build?

2015-04-16 Thread Sree V
1. 40 min+ to 1hr+, from jenkins. I didn't find the commands of the job. Does it require a login? Part of the console output: > git checkout -f 3ae37b93a7c299bd8b22a36248035bca5de3422f > git rev-list de4fa6b6d12e2bee0307ffba2abfca0c33f15e45 # timeout=10 Triggering Spark-Master-Maven-pre-YARN ? 2…

Re: how long does it take for a full build?

2015-04-16 Thread Kushal Datta
15-20 mins. On Thu, Apr 16, 2015 at 11:56 AM, Sree V wrote: > Hi Team, > How long does it take for a full build 'mvn clean package' on spark > 1.2.2-rc1? > > > Thanking you. > > With Regards > Sree

Re: how long does it take for a full build?

2015-04-16 Thread Ted Yu
You can get some idea by looking at the builds here: https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-1.2-Maven-with-YARN/HADOOP_PROFILE=hadoop-2.4,label=centos/ Cheers On Thu, Apr 16, 2015 at 11:56 AM, Sree V wrote: > Hi Team, > How long does it take for a full build 'mvn clean pa…

how long does it take for a full build?

2015-04-16 Thread Sree V
Hi Team, How long does it take for a full build ('mvn clean package') on Spark 1.2.2-rc1? Thanking you. With Regards Sree

Re: Dataframe from mysql database in pyspark

2015-04-16 Thread Reynold Xin
There is a jdbc method in the SQLContext Scala doc: https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.SQLContext Note that this is more of a user-list question. On Thu, Apr 16, 2015 at 5:11 AM, Suraj Shetiya wrote: > Hi, > > Is there any means of transforming mysql da…
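On the PySpark side of Suraj's question, the Spark 1.3/1.4-era API could load a JDBC source through SQLContext given a JDBC URL and a table name. A hedged sketch under those assumptions — the helper function, host, database, and credentials below are placeholders of ours, and the MySQL Connector/J jar must be on the classpath:

```python
def mysql_jdbc_options(host, port, db, table, user, password):
    """Build the option dict for a JDBC DataFrame source (all values are placeholders)."""
    return {
        # Standard MySQL JDBC URL shape with credentials as query parameters.
        "url": "jdbc:mysql://%s:%d/%s?user=%s&password=%s" % (host, port, db, user, password),
        "dbtable": table,
        "driver": "com.mysql.jdbc.Driver",  # requires MySQL Connector/J on the classpath
    }

opts = mysql_jdbc_options("dbhost", 3306, "shop", "orders", "spark", "secret")

# With a live SQLContext (Spark 1.3/1.4) this would become a DataFrame:
# df = sqlContext.load(source="jdbc", **opts)
# df.printSchema()
```

The Spark calls are left commented out since they need a running SparkContext and a reachable MySQL server; only the option-building step is shown executable.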

Re: spark shell paste mode is not consistent

2015-04-16 Thread vinodkc
Yes, I could see all actions in the Spark UI. The paste command returns only the last action's result to the console, which is why I got confused. Thank you for the help Vinod On Apr 16, 2015 5:22 PM, "Sean Owen [via Apache Spark Developers List]" < ml-node+s1001551n11624...@n3.nabble.com> wrote: > No, look at the…

Re: spark shell paste mode is not consistent

2015-04-16 Thread Sean Owen
No, look at the Spark UI. You can see all three were executed. On Thu, Apr 16, 2015 at 12:05 PM, Vinod KC wrote: > Hi Sean, > > In paste mode, the shell evaluates only the last action. It ignores previous > actions, i.e., it does not execute the actions > textFile.count() and textFile.first > > Tha…

Re: spark shell paste mode is not consistent

2015-04-16 Thread Vinod KC
Hi Sean, In paste mode, the shell evaluates only the last action. It ignores the previous actions, i.e., it does not execute the actions textFile.count() and textFile.first. Thanks Vinod I'm not sure I understand what you are suggesting is wrong. It prints the result of the last command. In the second cas…

Re: spark shell paste mode is not consistent

2015-04-16 Thread Sean Owen
I'm not sure I understand what you are suggesting is wrong. It prints the result of the last command. In the second case that is the whole pasted block, so you see 19. On Apr 16, 2015 11:37 AM, "vinodkc" wrote: > Hi All, > > I faced the below issue while working with spark. It seems spark shell paste…
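Sean's point — a block entered all at once echoes only its final result, while statements entered one at a time each echo — can be illustrated with a loose Python analogy. The Scala REPL's :paste differs in detail, but the echo model is similar; the 'single' vs 'exec' compile modes below are standard CPython behavior:

```python
import io
import contextlib

block = "1 + 1\n2 + 2\n3 + 3"

# Entered one statement at a time ('single' mode): every expression echoes,
# because 'single' mode routes each result through sys.displayhook.
line_by_line = io.StringIO()
with contextlib.redirect_stdout(line_by_line):
    for line in block.splitlines():
        exec(compile(line, "<repl>", "single"))

# Pasted and run as one block ('exec' mode): intermediate expression values
# are discarded, so nothing echoes unless printed explicitly.
pasted = io.StringIO()
with contextlib.redirect_stdout(pasted):
    exec(compile(block, "<paste>", "exec"))

print(line_by_line.getvalue().split())  # the three echoed results
print(repr(pasted.getvalue()))          # empty: no echo at all
```

In both modes all six expressions are evaluated; only the echoing differs — which matches what the Spark UI shows for the pasted Scala block: every action ran, even though only the last result was printed.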

spark shell paste mode is not consistent

2015-04-16 Thread vinodkc
Hi All, I faced the below issue while working with Spark. It seems the spark shell paste mode is not consistent. Example code --- val textFile = sc.textFile("README.md") textFile.count() textFile.first() val linesWithSpark = textFile.filter(line => line.contains("Spark")) textFile.filter(li…

Dataframe from mysql database in pyspark

2015-04-16 Thread Suraj Shetiya
Hi, Is there any means of transforming MySQL databases into DataFrames from PySpark? I was able to find a document that converts a MySQL database to a DataFrame in spark-shell (http://www.infoobjects.com/spark-sql-jdbcrdd/) using JDBC. I had been through the official documentation and can't find any po…