We ran into this YET AGAIN today, which led one of us to start checking
configuration settings. Our TaskTracker JVM turned out to be configured
with very little memory (200MB?). We increased that (512MB?) and now the
error is gone. So it appears the TaskTracker was getting OOM-killed.


On Tue, Jun 28, 2011 at 10:40 AM, Time Less <timelessn...@gmail.com> wrote:

> I'm having a hard time interpreting the JIRA - it seems to say that
> Hive is passing an incorrect "mapred.child.java.opts=XmxNNNM" parameter,
> missing the leading "-" ... Is that correct? If so, I could dig into the
> Hive source code and look for that bit.
>
> I admit I'm a Java newbie, so I'm not entirely certain I understand how
> Hive and Hadoop interact, but if Hive is passing incorrect memory arguments
> to Hadoop tasks, and they're OOMing, then that seems like the sort of fix I
> could undertake successfully.
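[Inline note: a quick way to see what Hive actually hands the child JVMs, without reading source, is to print the live setting from the CLI and eyeball the tokens for a missing dash. A rough sketch of that check, assuming the value has been copied out of the CLI:]

```shell
# First, print the live value from the Hive CLI:
#   hive -e "set mapred.child.java.opts;"
# Then run that value through a check like this. Any token that does
# not start with "-" (e.g. a bare "Xmx200m") is exactly what the
# missing-dash reading of the JIRA would predict.
check_opts() {
  for tok in $1; do
    case "$tok" in
      -*) ;;                        # normal JVM flag
      *)  echo "malformed: $tok" ;; # missing its leading dash
    esac
  done
}

check_opts "-Xmx200m"   # prints nothing
check_opts "Xmx200m"    # prints: malformed: Xmx200m
```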
>
>
>
> On Mon, Jun 27, 2011 at 5:13 PM, Sumanth V <vsumant...@gmail.com> wrote:
>
>> You are hitting this bug -
>> https://issues.apache.org/jira/browse/HIVE-1579
>> I consistently hit this bug for one of the Hive queries.
>>
>>
>> Sumanth
>>
>>
>>
>> On Mon, Jun 27, 2011 at 5:08 PM, Time Less <timelessn...@gmail.com> wrote:
>>
>>> Today I'm getting this error again. A Google search brought me back to...
>>> you guessed it... my own post. But this time there's no HDFS corruption. I
>>> bounced all services (namenode, jobtracker, datanodes, tasktrackers) and
>>> still get the same error. Here's what it looks like:
>>>
>>> *Fsck Output:*
>>> FSCK ended at Mon Jun 27 17:02:07 PDT 2011 in 818 milliseconds
>>> The filesystem under path '/' is HEALTHY
>>>
>>> *Hive Query Output:*
>>> -bash-3.2$ hive -e "select * from air_client_logs where loglevel =
>>> '[FATAL]'"
>>> Hive history file=/tmp/hdfs/hive_job_log_hdfs_201106271702_317571839.txt
>>> Total MapReduce jobs = 1
>>> Launching Job 1 out of 1
>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>> Starting Job = job_201106271658_0002, Tracking URL =
>>> http://hadooptest1:50030/jobdetails.jsp?jobid=job_201106271658_0002
>>> Kill Command = /usr/lib/hadoop/bin/hadoop job
>>> -Dmapred.job.tracker=hadooptest1:54311 -kill job_201106271658_0002
>>> 2011-06-27 17:02:39,874 Stage-1 map = 0%,  reduce = 0%
>>> 2011-06-27 17:03:02,111 Stage-1 map = 100%,  reduce = 100%
>>> Ended Job = job_201106271658_0002 with errors
>>>
>>> java.lang.RuntimeException: Error while reading from task log url
>>>     at
>>> org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
>>>     at
>>> org.apache.hadoop.hive.ql.exec.ExecDriver.showJobFailDebugInfo(ExecDriver.java:889)
>>>     at
>>> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:680)
>>>     at
>>> org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:123)
>>>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
>>>     at
>>> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>>>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
>>>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
>>>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
>>>     at
>>> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
>>>     at
>>> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
>>>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:425)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>
>>>     at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>     at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
>>> Caused by: java.io.IOException: Server returned HTTP response code: 400
>>> for URL:
>>> http://hadooptest14:50060/tasklog?taskid=attempt_201106271658_0002_m_000008_0&all=true
>>>
>>>     at
>>> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
>>>     at java.net.URL.openStream(URL.java:1010)
>>>     at
>>> org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
>>>     ... 16 more
>>> Ended Job = job_201106271658_0002 with exception
>>> 'java.lang.RuntimeException(Error while reading from task log url)'
>>>
>>> FAILED: Execution Error, return code 1 from
>>> org.apache.hadoop.hive.ql.exec.MapRedTask
>>>
>>> Again, trying to go to
>>> http://hadooptest14:50060/tasklog?taskid=attempt_201106271658_0002_m_000008_0&all=true
>>> returns that "Argument attemptid is required" error.
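[Inline note, a hunch I haven't confirmed: the servlet's error says "Argument attemptid is required" while the URL Hive builds sends "taskid", so this TaskTracker build may simply expect the newer parameter name. Renaming the parameter in the failing URL would confirm or rule that out:]

```shell
# Rewrite the failing tasklog URL, swapping "taskid" for "attemptid",
# which is the argument the servlet says it requires.
url='http://hadooptest14:50060/tasklog?taskid=attempt_201106271658_0002_m_000008_0&all=true'
fixed=${url/taskid=/attemptid=}
echo "$fixed"
# then fetch it: curl "$fixed"
```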
>>>
>>> What am I doing wrong here? I appear to keep doing it, whatever it is.
>>>
>>>
>>>
>>> On Fri, May 6, 2011 at 6:47 PM, Time Less <timelessn...@gmail.com> wrote:
>>>
>>>> My cluster went corrupt-mode. I wiped it, deleted the Hive metastore,
>>>> and started over. In the process I did a "yum upgrade," which probably
>>>> took me from CDH3b4 to CDH3u0. Now every time I submit a Hive query
>>>> complex enough to require a map/reduce job*, I get this error:
>>>>
>>>> 2011-05-06 18:39:14,533 Stage-1 map = 100%,  reduce = 100%
>>>> Ended Job = job_201104081532_0509 with errors
>>>> java.lang.RuntimeException: Error while reading from task log url
>>>>     at
>>>> org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
>>>>     at
>>>> org.apache.hadoop.hive.ql.exec.ExecDriver.showJobFailDebugInfo(ExecDriver.java:889)
>>>>     at
>>>> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:680)
>>>>     at
>>>> org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:123)
>>>> .....[snip].....
>>>>     at
>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>     at
>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
>>>> Caused by: java.io.IOException: Server returned HTTP response code: 400
>>>> for URL:
>>>> http://hadooptest3:50060/tasklog?taskid=attempt_201104081532_0509_m_000002_2&all=true
>>>>     at
>>>> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
>>>>     at java.net.URL.openStream(URL.java:1010)
>>>>     at
>>>> org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
>>>>     ... 16 more
>>>> Ended Job = job_201104081532_0509 with exception
>>>> 'java.lang.RuntimeException(Error while reading from task log url)'
>>>> FAILED: Execution Error, return code 1 from
>>>> org.apache.hadoop.hive.ql.exec.MapRedTask
>>>>
>>>> It seems to me the key point in here is this:
>>>>
>>>> Server returned HTTP response code: 400 for URL:
>>>> http://hadooptest3:50060/tasklog?taskid=attempt_201104081532_0509_m_000002_2&all=true
>>>>
>>>> So I submitted that URL to my web browser, which said:
>>>>
>>>> Problem accessing /tasklog. Reason: Argument attemptid is required
>>>>
>>>> Does anyone have clues what this means?
>>>>
>>>> --
>>>> Tim Ellis
>>>> Riot Games
>>>> * That is to say, simple queries like "select * from table limit 10"
>>>> return results just fine.
>>>>
>>>>
>>>
>>>
>>> --
>>> Tim
>>>
>>
>>
>
>
> --
> Tim
>



-- 
Tim
