Hive initialization on executors

2015-04-26 Thread Manku Timma
I am facing an exception "Hive.get() called without a hive db setup" in the executor. I wanted to understand how the Hive object is initialized in the executor threads. I only see Hive.get(hiveconf) in two places in the Spark 1.3 code. In HiveContext.scala - I don't think this is created on the executor. In…
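
A minimal sketch of the initialization pattern the question is circling, assuming the task closure itself calls Hive.get(HiveConf) before anything that needs the metastore; the object name, metastore URI, and table names below are hypothetical placeholders, not taken from the thread:

import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.ql.metadata.Hive
import org.apache.spark.{SparkConf, SparkContext}

object HiveOnExecutors {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("hive-on-executors"))
    // Hypothetical metastore settings the executors need; replace with real values.
    val hiveProps = sc.broadcast(Map("hive.metastore.uris" -> "thrift://metastore-host:9083"))

    val results = sc.parallelize(Seq("default.src", "default.logs"), 2).mapPartitions { names =>
      // Hive.get() hands back a per-thread client, so it has to be initialized with a
      // HiveConf on the executor thread before the first call that needs the hive db.
      val conf = new HiveConf()
      hiveProps.value.foreach { case (k, v) => conf.set(k, v) }
      val hive = Hive.get(conf)
      names.map { fullName =>
        val Array(db, table) = fullName.split('.')
        s"$fullName found: ${hive.getTable(db, table, false) != null}"
      }
    }.collect()

    results.foreach(println)
    sc.stop()
  }
}

The only point of the sketch is that Hive keeps its client per thread, so each executor task has to see a configured HiveConf before it (or code it calls) hits Hive.get() with no arguments.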

Re: Design docs: consolidation and discoverability

2015-04-26 Thread Patrick Wendell
I actually don't totally see why we can't use Google Docs provided it is clearly discoverable from the JIRA. It was my understanding that many projects do this. Maybe not (?). If it's a matter of maintaining a public record on ASF infrastructure, perhaps we can just automate that if an issue is closed…

Re: WebUI shows poor locality when task scheduling

2015-04-26 Thread Patrick Wendell
Hi Eric - please direct this to the user@ list. This list is for development of Spark itself. On Sun, Apr 26, 2015 at 1:12 AM, eric wong wrote: > Hi developers, > I have sent this to the user mailing list but got no response... > When running an experimental KMeans job, the cached RDD is…

Re: Spark timeout issue

2015-04-26 Thread Deepak Gopalakrishnan
Hello All, I'm trying to process a 3.5GB file in standalone mode using Spark. I can run my Spark job successfully on a 100MB file and it works as expected. But when I try to run it on the 3.5GB file, I run into the error below: 15/04/26 12:45:50 INFO BlockManagerMaster: Updated info of block…
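
A hedged sketch of settings that are commonly raised when a job that handles a small file starts failing on a much larger one in standalone mode; the values, object name, and input path are illustrative, and the truncated log above does not show which limit was actually hit:

import org.apache.spark.{SparkConf, SparkContext}

object LargeFileJob {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("large-file-job")
      .set("spark.executor.memory", "4g")  // more per-executor heap for the 3.5GB input (illustrative)
      .set("spark.network.timeout", "300") // seconds; raises several internal timeouts at once
    val sc = new SparkContext(conf)

    val lines = sc.textFile("/path/to/3.5gb-file") // hypothetical input location
    println(s"line count: ${lines.count()}")
    sc.stop()
  }
}

Whether memory or a timeout is the real culprit here can only be read off the full executor and master logs, so treat these knobs as a starting point rather than a fix.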

WebUI shows poor locality when task scheduling

2015-04-26 Thread eric wong
Hi developers, I have sent this to the user mailing list but got no response... When running an experimental KMeans job, the cached RDD is the original Points data. I saw poor locality in the Task details on the WebUI. Almost half of the task input is Network instead of Memory. And Task with network i…
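
A hedged sketch of the knobs and caching steps one might check when a cached RDD still reports Network rather than Memory input in the WebUI; the values, input path, and object name are illustrative, not a confirmed fix for this job:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object KMeansLocalityCheck {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kmeans-locality")
      .set("spark.locality.wait", "10000") // milliseconds; wait longer before scheduling at a worse locality level
    val sc = new SparkContext(conf)

    val points = sc.textFile("/path/to/points")   // hypothetical input
      .map(_.split(' ').map(_.toDouble))
      .persist(StorageLevel.MEMORY_ONLY)          // keep the parsed points in memory
    points.count()                                // materialize the cache before iterating

    println(s"cached partitions: ${points.partitions.length}")
    sc.stop()
  }
}

The idea is only that the scheduler gives up on process- or node-local placement after spark.locality.wait expires, and that the cache has to be fully materialized before iterations start, or tasks will keep reading their input over the network.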