Re: Unable to add to roles in JIRA

2015-07-07 Thread Sean Owen
PS the resolution on this is just that we've hit a JIRA limit, since the Contributor role is so big now. We have a currently-unused Developer role that barely has different permissions. I propose to move people that I recognize as regular Contributors into the Developer group to make room. Practic

TableScan vs PrunedScan

2015-07-07 Thread Gil Vernik
Hi All, I wanted to experiment a little bit with TableScan and PrunedScan. My first test was to print columns from various SQL queries. To make this test easier, I just took spark-csv and replaced TableScan with PrunedScan. I then changed the buildScan method of CsvRelation from def BuildScan =

Re: TableScan vs PrunedScan

2015-07-07 Thread Ram Sriharsha
Hi Gil, you would need to prune the resulting Row as well, based on the requested columns. Ram
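A minimal sketch of the pruning Ram describes, written in plain Python rather than against the Spark API. With PrunedScan, buildScan receives the requested column names, and each returned row is expected to hold only those columns, in the requested order. The schema, the sample lines, and the function name `build_scan` below are hypothetical and chosen only for illustration.

```python
# Plain-Python model of column pruning; this is NOT the Spark data source API.
# SCHEMA and the sample lines are hypothetical stand-ins for a CSV relation.
SCHEMA = ["year", "make", "model", "comment"]


def build_scan(lines, required_columns, schema=SCHEMA, delimiter=","):
    """Yield one tuple per line holding only the requested columns,
    in the order they were requested."""
    # Resolve each requested column name to its position in the full schema.
    indices = [schema.index(name) for name in required_columns]
    for line in lines:
        fields = line.split(delimiter)
        # Prune the row: keep only the requested fields.
        yield tuple(fields[i] for i in indices)


lines = ["2012,Tesla,S,No comment", "1997,Ford,E350,Go get one now"]
rows = list(build_scan(lines, ["model", "year"]))
```

Returning the full row while only declaring the pruned columns is the kind of mismatch that makes the printed columns look wrong, which is why the row itself has to be narrowed as well.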

Regarding master node failure

2015-07-07 Thread swetha
Hi, What happens if a master node fails in the case of Spark Streaming? Would the data be lost in that case? Thanks, Swetha

[RESULT] [VOTE] Release Apache Spark 1.4.1 (RC2)

2015-07-07 Thread Patrick Wendell
Hey All, This vote is cancelled in favor of RC3. - Patrick On Fri, Jul 3, 2015 at 1:15 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.4.1! > > This release fixes a handful of known issues in Spark 1.4.0, listed here: > http://s.apache.

[VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-07 Thread Patrick Wendell
Please vote on releasing the following candidate as Apache Spark version 1.4.1! This release fixes a handful of known issues in Spark 1.4.0, listed here: http://s.apache.org/spark-1.4.1 The tag to be voted on is v1.4.1-rc3 (commit 3e8ae38): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=co

Data interaction between various RDDs in Spark Streaming

2015-07-07 Thread swetha
Hi, Suppose I want the data to be grouped by an Id named "12345", and I have a certain amount of data coming out of one batch for "12345" and more data related to "12345" coming after 5 hours. How do I group by "12345" and have a single RDD containing the list? Thanks, Swetha

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-07 Thread Andrew Or
+1 Verified that the previous blockers SPARK-8781 and SPARK-8819 are now resolved.

Re: Data interaction between various RDDs in Spark Streaming

2015-07-07 Thread Akhil Das
updateStateByKey? Thanks Best Regards
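A rough sketch of the idea behind updateStateByKey, modeled in plain Python rather than run on Spark Streaming. State is kept per key and an update function merges each batch's new values into it, so values for "12345" that arrive hours apart end up in one list. The helper `apply_batch` and the sample batches are hypothetical; in Spark Streaming itself, the operation also requires checkpointing to be enabled.

```python
# Plain-Python model of per-key state carried across batches.
# This only imitates the semantics; it does not use Spark.

def update_func(new_values, state):
    """Merge this batch's values for a key into the running list for that key."""
    return (state or []) + list(new_values)


def apply_batch(state_by_key, batch):
    """Apply one batch of (key, value) pairs to the existing per-key state."""
    grouped = {}
    for key, value in batch:
        grouped.setdefault(key, []).append(value)
    new_state = {}
    # Every known key is visited, including keys with no new data in this batch.
    for key in set(state_by_key) | set(grouped):
        updated = update_func(grouped.get(key, []), state_by_key.get(key))
        if updated is not None:  # returning None would drop the key's state
            new_state[key] = updated
    return new_state


state = {}
state = apply_batch(state, [("12345", "a"), ("12345", "b"), ("999", "x")])
state = apply_batch(state, [("12345", "c")])  # e.g. a batch arriving hours later
```

Accumulating every value in a list grows without bound, so a real job would usually keep a bounded summary or expire old keys.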

Re: Unable to add to roles in JIRA

2015-07-07 Thread Reynold Xin
I've been adding people to the developer role to get around the JIRA limit.

Re: Unable to add to roles in JIRA

2015-07-07 Thread Sean Owen
Yeah, I've just realized a problem: the permissions for Developer are not the same as for Contributor. Developer includes the ability to Assign, but doesn't seem to include other, more basic permissions. I cleared room in Contributor in the meantime (no point in having Committers there; Committer permission i

Re: Unable to add to roles in JIRA

2015-07-07 Thread Reynold Xin
BTW Infra has the ability to create multiple groups. Maybe that's a better solution. Have contributor1, contributor2, contributor3 ...

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-07 Thread Krishna Sankar
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 27:24 min
   mvn clean package -Pyarn -Phadoop-2.6 -DskipTests
2. Tested pyspark, mllib
2.1. statistics (min,max,mean,Pearson,Spearman) OK
2.2. Linear/Ridge/Lasso Regression OK
2.3. Decision Tree, Naive Bayes OK
2.4. KMea

spark - redshift !!!

2015-07-07 Thread spark user
Hi, can you help me load data from an S3 bucket into Redshift? If you have sample code, could you please send it to me? Thanks su

thrift server reliability issue

2015-07-07 Thread Judy Nash
Hi everyone, Found a thrift server reliability issue on Spark 1.3.1 that causes thrift to fail. When the thrift server has too little memory allocated to the driver to process a request, its Spark SQL session exits with an OutOfMemory exception, causing the thrift server to stop working. Is this a kno

Spark job hangs when History server events are written to hdfs

2015-07-07 Thread Pankaj Arora
Hi, I am running a long-running application on YARN using Spark, and I am facing issues with Spark's history server when the events are written to HDFS. It seems to work fine for some time, and then I see the following exception. 2015-06-01 00:00:03,247 [SparkListenerBus] ERROR org.apa

RE: thrift server reliability issue

2015-07-07 Thread Cheng, Hao
Yes, it's a known issue. Either set a bigger heap size for the driver, or try setting `spark.sql.thriftServer.incrementalCollect=true`, which is a workaround for queries that return a huge result set.