RE: Handling questions in the mailing lists

2016-11-23 Thread assaf.mendelson
Sorry to reawaken this, but I just noticed it is possible to propose new topic-specific sites (http://area51.stackexchange.com/faq) for Stack Overflow. So, for example, we might have a spark.stackexchange.com Spark-specific site. The advantages of such a site are many. First of all it is spark spec

FOSDEM 2017 HPC, Bigdata and Data Science DevRoom CFP is closing soon

2016-11-23 Thread Roman Shaposhnik
Hi! Apologies for the extra-wide distribution (this exhausts my once-a-year ASF mail-to-all-bigdata-projects quota ;-)), but I wanted to suggest that all of you consider submitting talks to the FOSDEM 2017 HPC, Bigdata and Data Science DevRoom: https://hpc-bigdata-fosdem17.github.io/ It was

Re: Memory leak warnings in Spark 2.0.1

2016-11-23 Thread Nicholas Chammas
👍 Thanks for the reference and PR. On Wed, Nov 23, 2016 at 2:59 AM Reynold Xin wrote: > See https://issues.apache.org/jira/browse/SPARK-18557 > > > On Mon, Nov 21, 2016 at 1:16 PM, Nicholas Chammas < > nicholas.cham...@gmail.com> wrote: > > I'

Re: Spark Wiki now migrated to spark.apache.org

2016-11-23 Thread Nicholas Chammas
Same here. Nice to be able to deprecate most of the docs living on the wiki and refer to them on GitHub. On Wed, Nov 23, 2016 at 11:54 AM Holden Karau wrote: > That's awesome thanks for doing the migration :) > > On Wed, Nov 23, 2016 at 3:29 AM Sean Owen wrote: > > I completed the migration. Yo

Re: Spark Wiki now migrated to spark.apache.org

2016-11-23 Thread Holden Karau
That's awesome, thanks for doing the migration :) On Wed, Nov 23, 2016 at 3:29 AM Sean Owen wrote: > I completed the migration. You can see the results live right now at > http://spark.apache.org, and > https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage > > A summary of the changes:

[Spark Thriftserver] connection timeout option?

2016-11-23 Thread Artur Sukhenko
Hello devs, Let's say there are users running *Tableau Desktop + Spark + Spark SQL*; *HiveServer2* is installed, but they use *Spark* for the Thrift server connection. They are trying to configure Spark to drop the Thrift connection when there is inactivity for this specific user and the

PowerIterationClustering can't handle "large" files

2016-11-23 Thread Lydia Ickler
Hi all, I have a question regarding Power Iteration Clustering. I have an input file (a tab-separated edge list) which I read in and map to the required format of RDD[(Long, Long, Double)] to then apply PIC. So far so good… The implementation works fine if the input is small (up to 50MB). But it

Aggregating over sorted data

2016-11-23 Thread assaf.mendelson
Hi, An issue I have encountered frequently is the need to look at data in an ordered manner per key. A common way of doing this can be seen in classic MapReduce, where the shuffle stage provides sorted data per key, and one can therefore do a lot with that. It is of course relatively easy to achi

Spark Wiki now migrated to spark.apache.org

2016-11-23 Thread Sean Owen
I completed the migration. You can see the results live right now at http://spark.apache.org, and https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage A summary of the changes: https://issues.apache.org/jira/browse/SPARK-18073 The substance of the changes: https://github.com/apache/spa

Re: Is it possible to pass "-javaagent=customAgent.jar" into spark as a JAVA_OPTS

2016-11-23 Thread Artur Sukhenko
Hello Zak, I believe this video from Spark Summit would be useful for you: https://youtu.be/EB1-7AXQOhM They are talking about extending Spark with Java agents. On Tue, Nov 22, 2016, 23:50 Zak H wrote: > Hi, > > I'm interested in passing an agent that will expose jmx metrics from spark > to my

Re: view canonicalization - looking for database gurus to chime in

2016-11-23 Thread Jiang Xingbo
Hi all, I have recently prepared a design document for robust view canonicalization in Spark SQL. In the doc we defined the expected behavior and described a late-binding approach; views created by older versions of Spark/Hive are still supposed to work under this new approach. For more details, plea