Re: Spark (1.2) yarn allocator does not remove container request for allocated container, resulting in a bloated ask[] of containers and inefficient resource utilization of cluster resources.

2015-07-29 Thread prakhar jauhari
Hey all, thanks in advance. I am facing this issue in production, where, due to the increased container requests, the RM is reserving memory and hampering cluster utilization. Thus the fix needs to be patched on Spark 1.2. Has anyone looked into the removeContainerRequest part for allocated containers in sp

Re: unit test failure for hive query

2015-07-29 Thread Michael Armbrust
I'd suggest using org.apache.spark.sql.hive.test.TestHive as the context in unit tests. It takes care of creating separate directories for each invocation automatically. On Wed, Jul 29, 2015 at 7:02 PM, JaeSung Jun wrote: > Hi, > I'm working on custom SQL processing on top of Spark-SQL, and I'm

unit test failure for hive query

2015-07-29 Thread JaeSung Jun
Hi, I'm working on custom SQL processing on top of Spark-SQL, and I'm upgrading it along with Spark 1.4.1. I've got an error caused by multiple test suites accessing the Hive metastore at the same time, like: Cause: org.apache.derby.impl.jdbc.EmbedSQLException: Another instance of Derby may have alread

Re: update on git timeouts for jenkins builds

2015-07-29 Thread shane knapp
newp. still happening, and i'm still looking into it: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38880/console On Wed, Jul 29, 2015 at 12:20 PM, shane knapp wrote: > ok, i think i found the problem and solution to the git timeouts: > > https://stackoverflow.com/question

Re: Broadcast variable of size 1 GB fails with negative memory exception

2015-07-29 Thread Mike Hynes
Hi Imran, Thanks to you and Shivaram for looking into this, and opening the JIRA/PR. I will update you once the PR is merged if there are any other problems that arise from the broadcast. Mike On 7/29/15, Imran Rashid wrote: > Hi Mike, > > I dug into this a little more, and it turns out in this c

Re: update on git timeouts for jenkins builds

2015-07-29 Thread shane knapp
ok, i think i found the problem and solution to the git timeouts: https://stackoverflow.com/questions/12236415/git-clone-return-result-18-code-200-on-a-specific-repository so, on each worker i've run "git config --global http.postBuffer 524288000" as the jenkins user and we'll see if this makes a

Re: Broadcast variable of size 1 GB fails with negative memory exception

2015-07-29 Thread Imran Rashid
Hi Mike, I dug into this a little more, and it turns out in this case there is a pretty trivial fix -- the problem you are seeing is just from integer overflow before casting to a long in SizeEstimator. I've opened https://issues.apache.org/jira/browse/SPARK-9437 for this. For now, I think your
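The kind of overflow described in this message can be sketched in a few lines of Java. This is a minimal illustration only, not the code in SizeEstimator: the class name, method names, and parameters below are hypothetical. It shows why multiplying two ints and casting the result to a long afterwards can yield a negative size, and why widening one operand first avoids it.

```java
// Hypothetical illustration; these names do not come from Spark's SizeEstimator.
final class OverflowSketch {

    // Buggy pattern: the multiplication is done in 32-bit int arithmetic and
    // wraps around BEFORE the cast widens the (already wrong) result to long.
    static long sizeOverflowing(int elementCount, int bytesPerElement) {
        return (long) (elementCount * bytesPerElement);
    }

    // Fixed pattern: widen one operand first, so the multiplication itself
    // is carried out in 64-bit long arithmetic.
    static long sizeWidened(int elementCount, int bytesPerElement) {
        return (long) elementCount * bytesPerElement;
    }
}
```

With 300,000,000 elements of 8 bytes each, the true size is 2,400,000,000 bytes, which exceeds Integer.MAX_VALUE; the first method returns a negative number and the second returns the correct value.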

Re: "Spree": Live-updating web UI for Spark

2015-07-29 Thread mkhaitman
We tested this out on our dev cluster (Hadoop 2.7.1 + Spark 1.4.0), and it looks great! I might also be interested in contributing to it when I get a chance! Keep up the awesome work! :) Mark.

RE: Two joins in GraphX Pregel implementation

2015-07-29 Thread Ulanov, Alexander
Hi Ankur, Thank you! This looks like a nice simplification. There should be some performance improvement since newVerts are not cached now. I’ve added your patch: https://issues.apache.org/jira/browse/SPARK-9436 Best regards, Alexander From: Ankur Dave [mailto:ankurd...@gmail.com] Sent: Tuesda

Re: [ANNOUNCE] Nightly maven and package builds for Spark

2015-07-29 Thread Bharath Ravi Kumar
Hey Patrick, Any update on this front please? Thanks, Bharath On Fri, Jul 24, 2015 at 8:38 PM, Patrick Wendell wrote: > Hey Bharath, > > There was actually an incompatible change to the build process that > broke several of the Jenkins builds. This should be patched up in the > next day or two

Spark (1.2) yarn allocator does not remove container request for allocated container, resulting in a bloated ask[] of containers and inefficient resource utilization of cluster resources.

2015-07-29 Thread prakhar jauhari
This is because YARN's AM client does not remove fulfilled container requests from its map until the application's AM explicitly calls removeContainerRequest for them. Spark-1.2: Spark's YARN AM does not call removeContainerRequest for fulfilled container requests. Spark-
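The bookkeeping problem described in this message can be sketched with a toy model in Java. This is an assumption-laden illustration, not Hadoop's AMRMClient and not Spark's allocator: ToyRequestTable and all of its methods are hypothetical stand-ins. It only shows that if requests are never removed once containers are granted, the pending ask stays as large as it was before allocation.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for an AM-side request table. NOT Hadoop's AMRMClient;
// every name here is hypothetical and exists only to show the bookkeeping.
final class ToyRequestTable {
    private final List<String> outstanding = new ArrayList<>();

    void addContainerRequest(String request) {
        outstanding.add(request);
    }

    // A request stays in the table until the application master
    // removes it explicitly.
    void removeContainerRequest(String request) {
        outstanding.remove(request);
    }

    // Number of requests that would still be part of the next ask.
    int pendingAsk() {
        return outstanding.size();
    }

    // Simulates 'granted' allocations arriving for 'requested' requests.
    // When removeOnAllocate is false nothing is removed, so the ask stays inflated.
    static int askAfterAllocation(int requested, int granted, boolean removeOnAllocate) {
        ToyRequestTable table = new ToyRequestTable();
        for (int i = 0; i < requested; i++) {
            table.addContainerRequest("req-" + i);
        }
        for (int i = 0; i < granted; i++) {
            if (removeOnAllocate) {
                table.removeContainerRequest("req-" + i);
            }
        }
        return table.pendingAsk();
    }
}
```

In this model, ten requests with ten grants leave an ask of ten when nothing is removed, and an ask of zero when each fulfilled request is removed as its container arrives.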