Good luck. Share your results with us.

Daniel
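[Editor's note: for readers who hit the same UnknownHostException, the fix Daniel describes below — identical name-to-IP mappings on every node — can be sketched as follows. The 192.168.1.x addresses are placeholders (substitute your cluster's real IPs); the master/slave1..slave8 hostnames are the ones from this thread.]

```shell
#!/bin/sh
# Build one hosts fragment that maps every cluster node name to an IP.
# The addresses below are PLACEHOLDERS -- use your cluster's real ones.
cat > /tmp/cluster-hosts <<'EOF'
192.168.1.10 master
192.168.1.11 slave1
192.168.1.12 slave2
192.168.1.13 slave3
192.168.1.14 slave4
192.168.1.15 slave5
192.168.1.16 slave6
192.168.1.17 slave7
192.168.1.18 slave8
EOF

# Every node (including the master) needs the SAME entries: the MapReduce
# ApplicationMaster may be scheduled on any slave and must be able to
# resolve every other slave's hostname when launching containers.
for host in master slave1 slave2 slave3 slave4 slave5 slave6 slave7 slave8; do
  echo "would append /tmp/cluster-hosts to $host:/etc/hosts"
  # e.g.: ssh root@"$host" 'cat >> /etc/hosts' < /tmp/cluster-hosts
done
```

The ssh line is left commented out so the sketch is safe to run as-is; uncomment it (or use your configuration-management tool) to actually distribute the file.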
> On 24 Nov 2014, at 19:36, Amit Behera <amit.bd...@gmail.com> wrote:
>
> Hi Daniel,
>
> Thanks a lot. I will do that and rerun the query. :)
>
>> On Mon, Nov 24, 2014 at 10:59 PM, Daniel Haviv <daniel.ha...@veracity-group.com> wrote:
>> It is a problem, as the application master needs to contact the other nodes.
>>
>> Try updating the hosts file on all the machines and try again.
>>
>> Daniel
>>
>>> On 24 Nov 2014, at 19:26, Amit Behera <amit.bd...@gmail.com> wrote:
>>>
>>> I did not modify it on all the slaves, only on one slave.
>>>
>>> Will that be a problem?
>>>
>>> For small data (up to a 20 GB table) queries run, but on the 300 GB table only count(*) runs, and even that sometimes fails.
>>>
>>> Thanks,
>>> Amit
>>>
>>>> On Mon, Nov 24, 2014 at 10:37 PM, Daniel Haviv <daniel.ha...@veracity-group.com> wrote:
>>>> Did you copy the hosts file to all the nodes?
>>>>
>>>> Daniel
>>>>
>>>>> On 24 Nov 2014, at 19:04, Amit Behera <amit.bd...@gmail.com> wrote:
>>>>>
>>>>> Hi Daniel,
>>>>>
>>>>> The stack trace is the same for other queries; on different runs I sometimes get slave7, sometimes slave8...
>>>>>
>>>>> I also registered all the machine IPs in /etc/hosts.
>>>>>
>>>>> Regards,
>>>>> Amit
>>>>>
>>>>>> On Mon, Nov 24, 2014 at 10:22 PM, Daniel Haviv <daniel.ha...@veracity-group.com> wrote:
>>>>>> It seems that the application master can't resolve slave6's name to an IP.
>>>>>>
>>>>>> Daniel
>>>>>>
>>>>>>> On 24 Nov 2014, at 18:49, Amit Behera <amit.bd...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Users,
>>>>>>>
>>>>>>> My cluster (1 master + 8 slaves) configuration:
>>>>>>>
>>>>>>> RAM:  32 GB each
>>>>>>> HDFS: 1.5 TB SSD
>>>>>>> CPU:  8 cores each
>>>>>>>
>>>>>>> -----------------------------------------------
>>>>>>>
>>>>>>> I am trying to query a 300 GB table, but I can only run SELECT queries. Every other query fails with the following exception:
>>>>>>>
>>>>>>> Total jobs = 1
>>>>>>> Stage-1 is selected by condition resolver.
>>>>>>> Launching Job 1 out of 1
>>>>>>> Number of reduce tasks not specified. Estimated from input data size: 183
>>>>>>> In order to change the average load for a reducer (in bytes):
>>>>>>>   set hive.exec.reducers.bytes.per.reducer=<number>
>>>>>>> In order to limit the maximum number of reducers:
>>>>>>>   set hive.exec.reducers.max=<number>
>>>>>>> In order to set a constant number of reducers:
>>>>>>>   set mapreduce.job.reduces=<number>
>>>>>>> Starting Job = job_1416831990090_0005, Tracking URL = http://master:8088/proxy/application_1416831990090_0005/
>>>>>>> Kill Command = /root/hadoop/bin/hadoop job -kill job_1416831990090_0005
>>>>>>> Hadoop job information for Stage-1: number of mappers: 679; number of reducers: 183
>>>>>>> 2014-11-24 19:43:01,523 Stage-1 map = 0%, reduce = 0%
>>>>>>> 2014-11-24 19:43:22,730 Stage-1 map = 53%, reduce = 0%, Cumulative CPU 625.19 sec
>>>>>>> 2014-11-24 19:43:23,778 Stage-1 map = 100%, reduce = 100%
>>>>>>> MapReduce Total cumulative CPU time: 10 minutes 25 seconds 190 msec
>>>>>>> Ended Job = job_1416831990090_0005 with errors
>>>>>>> Error during job, obtaining debugging information...
>>>>>>> Examining task ID: task_1416831990090_0005_m_000005 (and more) from job job_1416831990090_0005
>>>>>>> Examining task ID: task_1416831990090_0005_m_000042 (and more) from job job_1416831990090_0005
>>>>>>> Examining task ID: task_1416831990090_0005_m_000035 (and more) from job job_1416831990090_0005
>>>>>>> Examining task ID: task_1416831990090_0005_m_000065 (and more) from job job_1416831990090_0005
>>>>>>> Examining task ID: task_1416831990090_0005_m_000002 (and more) from job job_1416831990090_0005
>>>>>>> Examining task ID: task_1416831990090_0005_m_000007 (and more) from job job_1416831990090_0005
>>>>>>> Examining task ID: task_1416831990090_0005_m_000058 (and more) from job job_1416831990090_0005
>>>>>>> Examining task ID: task_1416831990090_0005_m_000043 (and more) from job job_1416831990090_0005
>>>>>>>
>>>>>>> Task with the most failures(4):
>>>>>>> -----
>>>>>>> Task ID: task_1416831990090_0005_m_000005
>>>>>>> URL: http://master:8088/taskdetails.jsp?jobid=job_1416831990090_0005&tipid=task_1416831990090_0005_m_000005
>>>>>>> -----
>>>>>>> Diagnostic Messages for this Task:
>>>>>>> Container launch failed for container_1416831990090_0005_01_000112 : java.lang.IllegalArgumentException: java.net.UnknownHostException: slave6
>>>>>>>     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
>>>>>>>     at org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:397)
>>>>>>>     at org.apache.hadoop.yarn.util.ConverterUtils.convertFromYarn(ConverterUtils.java:233)
>>>>>>>     at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:211)
>>>>>>>     at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:189)
>>>>>>>     at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:110)
>>>>>>>     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
>>>>>>>     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
>>>>>>>     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>>> Caused by: java.net.UnknownHostException: slave6
>>>>>>>     ... 12 more
>>>>>>>
>>>>>>> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>>>> MapReduce Jobs Launched:
>>>>>>> Job 0: Map: 679  Reduce: 183  Cumulative CPU: 625.19 sec  HDFS Read: 0  HDFS Write: 0  FAIL
>>>>>>> Total MapReduce CPU Time Spent: 10 minutes 25 seconds 190 msec
>>>>>>>
>>>>>>> Please help me to fix the issue.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Amit
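[Editor's note: before rerunning the 300 GB query, it is worth confirming that every node name actually resolves — from every node, since the ApplicationMaster can land anywhere. A minimal sketch, using `getent hosts` (available on typical Linux nodes) and hostnames taken from this thread; `localhost` is included only as a sanity check:]

```shell
#!/bin/sh
# check_resolution NAME -- succeeds only if NAME resolves via the system
# resolver (/etc/hosts or DNS), the same lookup path Hadoop uses.
check_resolution() {
  getent hosts "$1" > /dev/null 2>&1
}

# Run this loop on EACH node; an UNRESOLVED line means that node's
# /etc/hosts still lacks an entry and container launches there will fail
# with java.net.UnknownHostException.
for host in localhost master slave6 slave7 slave8; do
  if check_resolution "$host"; then
    echo "$host: ok"
  else
    echo "$host: UNRESOLVED -- fix /etc/hosts before running Hive"
  fi
done
```

Only when every node reports "ok" for every other node is the cluster in the state Daniel's fix requires.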