Re: [VOTE] Release Apache Spark 1.4.0 (RC2)

2015-05-25 Thread jameszhouyi
Compiled:
git clone https://github.com/apache/spark.git
git checkout tags/v1.4.0-rc2
./make-distribution.sh --tgz --skip-java-test -Pyarn -Phadoop-2.4 -Dhadoop.version=2.5.0 -Phive -Phive-0.13.1 -Phive-thriftserver -DskipTests
Blocking issue in RC1/RC2: https://issues.apache.org/jira/browse/SPARK-711

SparkR and RDDs

2015-05-25 Thread Andrew Psaltis
Hi, I understand from SPARK-6799[1] and the respective merge commit [2] that the RDD class is private in Spark 1.4. If I wanted to modify the old Kmeans and/or LR examples so that the computation happened in Spark, what is the best direction to go? Sorry if I am missing something obvious, but base

Re: Change for submitting to yarn in 1.3.1

2015-05-25 Thread Chester Chen
I put the design requirements and description in the commit comment, so I will close the PR. Please refer to the following commit: https://github.com/AlpineNow/spark/commit/5b336bbfe92eabca7f4c20e5d49e51bb3721da4d On Mon, May 25, 2015 at 3:21 PM, Chester Chen wrote: > All, > I have created a

Re: Change for submitting to yarn in 1.3.1

2015-05-25 Thread Chester Chen
All, I have created a PR just for the purpose of helping document the use case, requirements and design. It is unlikely to get merged, so it is only used to illustrate the problems we are trying to solve and the approaches we took. https://github.com/apache/spark/pull/6398 Hope this helps

Re: [VOTE] Release Apache Spark 1.4.0 (RC2)

2015-05-25 Thread Olivier Girardot
I've just tested the new window functions using PySpark in the Spark 1.4.0 RC2 distribution for Hadoop 2.4, with and without Hive support. It works well with the Hive-enabled distribution and fails as expected on the other one (with an explicit error: "Could not resolve window function 'le

Hive metadata operations support

2015-05-25 Thread Igor Mazur
Hi! I've found that Spark only supports ExecuteStatementOperation in SparkSQLOperationManager. Are there any plans to support other metadata operations? The reason I ask: I'm trying to connect PrestoDB through the standard Hive JDBC driver, and it doesn't see any tables that are registered as

RE: [VOTE] Release Apache Spark 1.4.0 (RC2)

2015-05-25 Thread Wang, Daoyuan
Good catch! BTW, SPARK-6784 is a duplicate of SPARK-7790; I didn't notice we changed the title of SPARK-7853. -Original Message- From: Cheng, Hao [mailto:hao.ch...@intel.com] Sent: Monday, May 25, 2015 4:47 PM To: Sean Owen; Patrick Wendell Cc: dev@spark.apache.org Subject: RE: [VOTE] Rele

RE: [VOTE] Release Apache Spark 1.4.0 (RC2)

2015-05-25 Thread Cheng, Hao
Adding another Blocker issue, just created. It seems to be a regression. https://issues.apache.org/jira/browse/SPARK-7853 -Original Message- From: Sean Owen [mailto:so...@cloudera.com] Sent: Monday, May 25, 2015 3:37 PM To: Patrick Wendell Cc: dev@spark.apache.org Subject: Re: [VOTE] Release Apa

Re: [VOTE] Release Apache Spark 1.4.0 (RC2)

2015-05-25 Thread Sean Owen
We still have 1 blocker for 1.4: SPARK-6784, "Make sure values of partitioning columns are correctly converted based on their data types". CC Davies Liu / Adrian Wang to check on the status of this. There are still 50 Critical issues tagged for 1.4, and 183 issues targeted for 1.4 in general. Obviou

Re: Tungsten's Vectorized Execution

2015-05-25 Thread Reynold Xin
Yes that's exactly the reason. On Sat, May 23, 2015 at 12:37 AM, Yijie Shen wrote: > Davies and Reynold, > > Glad to hear about the status. > > I’ve seen [SPARK-7813](https://issues.apache.org/jira/browse/SPARK-7813) > and watching it now. > > If I understand correctly, it’s aimed at moving Cod