[jira] [Created] (HIVE-10648) LLAP: registry; Tez attempted to schedule to daemon that didn't exist

Sergey Shelukhin (JIRA) Thu, 07 May 2015 14:53:11 -0700

Sergey Shelukhin created HIVE-10648:
---------------------------------------


             Summary: LLAP: registry; Tez attempted to schedule to daemon that didn't exist
                 Key: HIVE-10648
                 URL: https://issues.apache.org/jira/browse/HIVE-10648
             Project: Hive
          Issue Type: Sub-task
            Reporter: Sergey Shelukhin
            Assignee: Gopal V


I can post logs externally; for now, the app IDs on the test cluster are 
application_1429683757595_0784 and application_1429683757595_0783. I also have 
the logs copied over.
The AM found the node (the logs for the other nodes look the same):
{noformat}
2015-05-07 12:13:28,074 INFO [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerEventHandler] impl.LlapYarnRegistryImpl: Adding new worker 342f4992-2608-43ab-a119-b50882e35f75 which mapped to DynamicServiceInstance [alive=true, host=cn059-10.l42scl.hortonworks.com:15001 with resources=<memory:20480, vCores:6>]
....
2015-05-07 12:13:28,082 INFO [Dispatcher thread: Central] node.AMNodeTracker: Num cluster nodes = 19
{noformat}

Trouble is, this node never actually existed. The cluster only had 15 nodes. 
As the job progressed, the AM repeatedly tried to schedule to this node and 
failed. No other LLAP cluster was running at the same time.
In fact, given that I always start a 15-node cluster, I am not sure where the 
19-node data could conceivably come from.
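
A minimal sketch for cross-checking the AM log against the real node list, in case it helps narrow down which registrations are bogus. It assumes the "Adding new worker" line format shown in the log excerpt above. The function names and the expected-host list are illustrative; they are not part of Hive or Tez.

```python
import re

# Matches the "Adding new worker" lines from LlapYarnRegistryImpl, based only
# on the format seen in the AM log excerpt (assumed stable across entries).
WORKER_RE = re.compile(
    r"Adding new worker\s+(?P<worker>[0-9a-f-]{36})\s+which mapped to\s+"
    r"DynamicServiceInstance\s+\[alive=(?P<alive>\w+),\s+"
    r"host=(?P<host>[^:\s]+):(?P<port>\d+)"
)


def registered_hosts(log_text):
    """Return {host: worker_id} for every worker the AM reported adding.

    If a host appears more than once, the last worker ID seen wins.
    """
    # Log lines may be wrapped (e.g. by a mail client); collapse whitespace
    # so a single entry can be matched regardless of line breaks.
    flat = " ".join(log_text.split())
    return {m.group("host"): m.group("worker") for m in WORKER_RE.finditer(flat)}


def unknown_hosts(log_text, expected_hosts):
    """Return hosts the AM registered that are not in the expected node list."""
    return sorted(set(registered_hosts(log_text)) - set(expected_hosts))
```

Calling `unknown_hosts(am_log_text, expected)` with the hostnames of the nodes that were actually started should list any registrations that do not correspond to a real node.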



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
