Re: spark job scheduling

2016-01-27 Thread Jakob Odersky
Nitpick: the up-to-date version of said wiki page is https://spark.apache.org/docs/1.6.0/job-scheduling.html (not sure how much it has changed, though).

Re: spark job scheduling

2016-01-27 Thread Chayapan Khannabha
I would start at this wiki page: https://spark.apache.org/docs/1.2.0/job-scheduling.html. That said, I'm sure this depends a lot on your cluster environment and the deployed Spark version. IMHO.

Re: spark job scheduling

2016-01-27 Thread Niranda Perera
Sorry, I made some typos. Let me rephrase: 1. As I understand it, the smallest unit of work an executor can perform is a 'task'. In the 'FAIR' scheduler mode, let's say a job with a considerable amount of work to do in a single task is submitted to the Spark context. While such a 'big' task is running…
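
For context, a minimal sketch of enabling FAIR scheduling in a Spark 1.6-era application; the pool name and the allocation-file path are illustrative, not from this thread:

    import org.apache.spark.{SparkConf, SparkContext}

    // Enable FAIR scheduling; without an allocation file, all jobs share one default pool.
    val conf = new SparkConf()
      .setAppName("fair-scheduling-demo")
      .set("spark.scheduler.mode", "FAIR")
      .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml") // optional; path illustrative

    val sc = new SparkContext(conf)

    // Jobs submitted from this thread run in the named pool (name illustrative).
    sc.setLocalProperty("spark.scheduler.pool", "production")

Note that even in FAIR mode a task that is already running is not preempted; fairness governs how free task slots are handed out across jobs.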

Re: spark job scheduling

2016-01-27 Thread Chayapan Khannabha
I think the smallest unit of work is a "Task", and an "Executor" is responsible for getting the work done? I would like to understand more about the scheduling system too. Scheduling strategies like FAIR or FIFO do have a significant impact on Spark cluster architecture design decisions. Best, Chayapan

Re: Re: timeout in shuffle problem

2016-01-27 Thread wangzhenhua (G)
External shuffle service is not enabled. Best regards, -zhenhua
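
For reference, enabling the external shuffle service typically comes down to settings along these lines (a sketch of the standard configuration, not taken from this thread; on YARN the auxiliary service must also be registered in yarn-site.xml):

    import org.apache.spark.SparkConf

    // Have executors hand shuffle files off to the node-local shuffle service,
    // so they survive executor loss.
    val conf = new SparkConf()
      .set("spark.shuffle.service.enabled", "true")
      // Commonly paired with dynamic allocation, which requires the service.
      .set("spark.dynamicAllocation.enabled", "true")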

spark job scheduling

2016-01-27 Thread Niranda Perera
Hi all, I have a few questions on Spark job scheduling. 1. As I understand it, the smallest unit of work an executor can perform is a 'task'. In the 'fair' scheduler mode, let's say a job with a considerable amount of work to do in a single task is submitted to the Spark context. While such a 'big' task is running…

Re: Spark 2.0.0 release plan

2016-01-27 Thread Michael Armbrust
We do maintenance releases on demand when there is enough to justify doing one. I'm hoping to cut 1.6.1 soon, but have not had time yet. On Wed, Jan 27, 2016 at 8:12 AM, Daniel Siegmann <daniel.siegm...@teamaol.com> wrote: > Will there continue to be monthly releases on the 1.6.x branch during…

Re: Mutiple spark contexts

2016-01-27 Thread Nicholas Chammas
There is a lengthy discussion about this on the JIRA: https://issues.apache.org/jira/browse/SPARK-2243

Re: Mutiple spark contexts

2016-01-27 Thread Herman van Hövell tot Westerflier
Just out of curiosity: what is the use case for having multiple active contexts in a single JVM? Kind regards, Herman van Hövell

Re: Mutiple spark contexts

2016-01-27 Thread Reynold Xin
There are no major obstacles, just a million tiny obstacles that would take forever to fix.

Re: Mutiple spark contexts

2016-01-27 Thread Ashish Soni
There is a property you need to set: spark.driver.allowMultipleContexts=true. Ashish
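
A minimal sketch of that flag in use (note it merely suppresses the one-context-per-JVM check; multiple contexts remain unsupported):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("multi-context-demo")
      .setMaster("local[2]")
      .set("spark.driver.allowMultipleContexts", "true")

    val sc1 = new SparkContext(conf)
    val sc2 = new SparkContext(conf) // throws without the flag; use at your own risk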

Mutiple spark contexts

2016-01-27 Thread Jakob Odersky
A while ago, I remember reading that multiple active Spark contexts per JVM was a possible future enhancement. I was wondering if this is still on the roadmap, what the major obstacles are, and if I can be of any help in adding this feature? Regards, --Jakob

Adding Naive Bayes sample code in Documentation

2016-01-27 Thread Vinayak Agrawal
Hi, I was reading through the Spark ML package and couldn't find Naive Bayes examples documented on the Spark documentation page: http://spark.apache.org/docs/latest/ml-classification-regression.html. However, the API exists and can be used: https://spark.apache.org/docs/1.5.2/api/python/pyspark.ml.h…
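
For reference, a minimal spark.ml Naive Bayes example in the style of the other classifier examples on that page (Scala; assumes an existing SQLContext and the stock sample data file shipped with Spark):

    import org.apache.spark.ml.classification.NaiveBayes

    // Load the standard LIBSVM sample (Spark 1.6 data source syntax).
    val data = sqlContext.read.format("libsvm").load("data/mllib/sample_libsvm_data.txt")
    val Array(training, test) = data.randomSplit(Array(0.7, 0.3), seed = 1234L)

    // Fit a Naive Bayes model with default smoothing, then score the held-out split.
    val model = new NaiveBayes().fit(training)
    model.transform(test).select("label", "prediction").show(5)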

RE: spark hivethriftserver problem on 1.5.0 -> 1.6.0 upgrade

2016-01-27 Thread james.gre...@baesystems.com
Thanks Yin, here are the logs:
INFO SparkContext - Added JAR file:/home/jegreen1/mms/zookeeper-3.4.6.jar at http://10.39.65.122:38933/jars/zookeeper-3.4.6.jar with timestamp 1453907484092
INFO SparkContext - Added JAR file:/home/jegreen1/mms/mms-http-0.2-SNAPSHOT.jar at http://10.39.65.12…

Re: timeout in shuffle problem

2016-01-27 Thread Hamel Kothari
Are you running on YARN? Another possibility here is that your shuffle managers are facing GC pain and becoming less responsive, thus missing timeouts. Can you try increasing the memory on the node managers and see if that helps?
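
If GC pauses are indeed causing missed timeouts, the usual knobs look something like this (values are illustrative, not a recommendation from the thread):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.network.timeout", "300s")    // default 120s; raise if pauses outlast it
      .set("spark.executor.memory", "6g")      // extra headroom can reduce GC pressure
      .set("spark.shuffle.io.maxRetries", "6") // retry shuffle fetches that fail transiently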

Re: Using distinct count in over clause

2016-01-27 Thread Herman van Hövell tot Westerflier
Hi, We currently do not support DISTINCT clauses in window functions, nor is such functionality planned. Spark 2.0 uses native Spark UDAFs (instead of Hive window functions) and allows you to use your own UDAFs; it is trivial to implement a distinct count/sum in that case. Kind regards, Herman
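
To illustrate the roll-your-own route, a rough sketch of a distinct-count aggregate against the UDAF API available since Spark 1.5 (the class name and String input type are illustrative; a production version would want a more compact buffer than an array):

    import org.apache.spark.sql.Row
    import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
    import org.apache.spark.sql.types._

    class DistinctCount extends UserDefinedAggregateFunction {
      def inputSchema: StructType = StructType(StructField("value", StringType) :: Nil)
      // The buffer holds the set of values seen so far, as an array column.
      def bufferSchema: StructType = StructType(StructField("seen", ArrayType(StringType)) :: Nil)
      def dataType: DataType = LongType
      def deterministic: Boolean = true

      def initialize(buffer: MutableAggregationBuffer): Unit = {
        buffer(0) = Seq.empty[String]
      }

      def update(buffer: MutableAggregationBuffer, input: Row): Unit = {
        if (!input.isNullAt(0)) {
          val seen = buffer.getSeq[String](0)
          val v = input.getString(0)
          if (!seen.contains(v)) buffer(0) = seen :+ v
        }
      }

      def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit = {
        buffer1(0) = (buffer1.getSeq[String](0) ++ buffer2.getSeq[String](0)).distinct
      }

      // Final result: how many distinct values were seen.
      def evaluate(buffer: Row): Any = buffer.getSeq[String](0).size.toLong
    }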

Re: Using distinct count in over clause

2016-01-27 Thread Akhil Das
Does it support OVER? I couldn't find it in the documentation: http://spark.apache.org/docs/latest/sql-programming-guide.html#supported-hive-features. Thanks, Best Regards. On Fri, Jan 22, 2016 at 2:31 PM, 汪洋 wrote: > I think it cannot be right. > On Jan 22, 2016, at 4:53 PM, 汪洋 wrote: > Hi, > Do we sup…
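
For what it's worth, the DataFrame API has exposed OVER via window specifications since 1.4 (a sketch; df and the column names are illustrative, and before Spark 2.0 this required a HiveContext):

    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions._

    // Rank rows within each department by salary, descending.
    val w = Window.partitionBy("department").orderBy(desc("salary"))
    df.select(col("name"), rank().over(w).alias("rank"))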

Re: Generate Amplab queries set

2016-01-27 Thread Akhil Das
Have a look at the TPC-H queries; I found this repository with the queries: https://github.com/ssavvides/tpch-spark. Thanks, Best Regards. On Fri, Jan 22, 2016 at 1:35 AM, sara mustafa wrote: > Hi, > I have downloaded the Amplab benchmark dataset from > s3n://big-data-benchmark/pavlo/text/tiny, but…

Re: BUILD FAILURE at spark-sql_2.11?!

2016-01-27 Thread Ted Yu
Strangely, both Jenkins jobs showed green status: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/SPARK-master-COMPILE-sbt-SCALA-2.11/ https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/SPARK-master-COMPILE-MAVEN-SCALA-2.11/

Re: BUILD FAILURE at spark-sql_2.11?!

2016-01-27 Thread Jean-Baptiste Onofré
Thanks Jacek, I have the same issue here. Regards, JB

Re: BUILD FAILURE at spark-sql_2.11?!

2016-01-27 Thread Jacek Laskowski
Hi, Pull request submitted: https://github.com/apache/spark/pull/10946/files. Please review and merge. Regards, Jacek

Re: BUILD FAILURE at spark-sql_2.11?!

2016-01-27 Thread Jacek Laskowski
Hi, My very rough investigation showed that the commit that may have broken the build was https://github.com/apache/spark/commit/555127387accdd7c1cf236912941822ba8af0a52 (nongli committed with rxin 7 hours ago). Found a fix and building the sources again... Regards, Jacek

BUILD FAILURE at spark-sql_2.11?!

2016-01-27 Thread Jacek Laskowski
Hi, I tried to build the sources today with Scala 2.11 twice and it failed. No local changes. Restarted zinc. Can anyone else confirm it? Since the error is buried in the logs, I'm asking now without offering more information (before I catch the cause) so that I or *the issue* get corrected (whatever fi…