I'm having a hard time interpreting the JIRA. It seems to be saying that Hive is passing an incorrect "mapred.child.java.opts=XmxNNNM" parameter, one that is missing the leading "-" (i.e. it should be "-XmxNNNM"). Is that correct? I could dig into the Hive source code and look for this bit if that's so.
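To make the suspected problem concrete, here is a minimal sketch of a check for a heap-size flag that has lost its leading dash. It only inspects a string; it does not read any Hive or Hadoop configuration, and whether Hive really passes such a value is not confirmed here. The function name `malformed_heap_flags` is hypothetical.

```python
import re

# Matches a JVM heap-size token such as -Xmx512M or -Xms1g.
# The leading dash is optional in the pattern so that a malformed
# token like "Xmx512M" can be recognised and reported.
HEAP_FLAG = re.compile(r"^(-?)(Xmx|Xms)(\d+)([kKmMgG]?)$")


def malformed_heap_flags(opts):
    """Return the heap-size tokens in a JVM options string that lack the leading '-'."""
    bad = []
    for token in opts.split():
        match = HEAP_FLAG.match(token)
        if match and match.group(1) == "":
            bad.append(token)
    return bad


# Hypothetical values of mapred.child.java.opts, for illustration only.
print(malformed_heap_flags("Xmx512M"))
print(malformed_heap_flags("-Xmx512M -XX:+UseParallelGC"))
```

A JVM started with "Xmx512M" instead of "-Xmx512M" would treat the token as a class name rather than an option, so a check like this can help rule that out before digging into the Hive source.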
I admit, I'm a Java newbie, so I'm not entirely certain I understand how Hive/Hadoop interact, but if Hive is passing incorrect memory arguments to Hadoop tasks, and they're OOMing, then it seems the sort of fix I could undertake successfully.

On Mon, Jun 27, 2011 at 5:13 PM, Sumanth V <vsumant...@gmail.com> wrote:

> You are hitting this bug - https://issues.apache.org/jira/browse/HIVE-1579
> I consistently hit this bug for one of the Hive queries.
>
> Sumanth
>
> On Mon, Jun 27, 2011 at 5:08 PM, Time Less <timelessn...@gmail.com> wrote:
>
>> Today I'm getting this error again. A Google search brought me back to...
>> you guessed it... my own post. But this time no HDFS corruption. Bounced all
>> services, namenode, jobtracker, datanodes, tasktrackers. Still same error.
>> Here's what it looks like:
>>
>> *Fsck Output:*
>> FSCK ended at Mon Jun 27 17:02:07 PDT 2011 in 818 milliseconds
>> The filesystem under path '/' is HEALTHY
>>
>> *Hive Query Output:*
>> -bash-3.2$ hive -e "select * from air_client_logs where loglevel = '[FATAL]'"
>> Hive history file=/tmp/hdfs/hive_job_log_hdfs_201106271702_317571839.txt
>> Total MapReduce jobs = 1
>> Launching Job 1 out of 1
>> Number of reduce tasks is set to 0 since there's no reduce operator
>> Starting Job = job_201106271658_0002, Tracking URL = http://hadooptest1:50030/jobdetails.jsp?jobid=job_201106271658_0002
>> Kill Command = /usr/lib/hadoop/bin/hadoop job -Dmapred.job.tracker=hadooptest1:54311 -kill job_201106271658_0002
>> 2011-06-27 17:02:39,874 Stage-1 map = 0%, reduce = 0%
>> 2011-06-27 17:03:02,111 Stage-1 map = 100%, reduce = 100%
>> Ended Job = job_201106271658_0002 with errors
>>
>> java.lang.RuntimeException: Error while reading from task log url
>>     at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
>>     at org.apache.hadoop.hive.ql.exec.ExecDriver.showJobFailDebugInfo(ExecDriver.java:889)
>>     at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:680)
>>     at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:123)
>>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
>>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
>>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
>>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
>>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
>>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
>>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:425)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
>> Caused by: java.io.IOException: Server returned HTTP response code: 400 for URL: http://hadooptest14:50060/tasklog?taskid=attempt_201106271658_0002_m_000008_0&all=true
>>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
>>     at java.net.URL.openStream(URL.java:1010)
>>     at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
>>     ... 16 more
>> Ended Job = job_201106271658_0002 with exception 'java.lang.RuntimeException(Error while reading from task log url)'
>>
>> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
>>
>> Again, trying to go to "http://hadooptest14:50060/tasklog?taskid=attempt_201106271658_0002_m_000008_0&all=true"
>> returns that argument attemptid error.
>>
>> What am I doing wrong here? I appear to keep doing it, whatever it is.
>>
>> On Fri, May 6, 2011 at 6:47 PM, Time Less <timelessn...@gmail.com> wrote:
>>
>>> My cluster went corrupt-mode. I wiped it and deleted the Hive metastore
>>> and started over. In the process, I did a "yum upgrade" which probably took
>>> me from CDH3b4 to CDH3u0. Now everytime I submit a Hive query of complexity
>>> requiring a map/reduce job*, I get this error:
>>>
>>> 2011-05-06 18:39:14,533 Stage-1 map = 100%, reduce = 100%
>>> Ended Job = job_201104081532_0509 with errors
>>> java.lang.RuntimeException: Error while reading from task log url
>>>     at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
>>>     at org.apache.hadoop.hive.ql.exec.ExecDriver.showJobFailDebugInfo(ExecDriver.java:889)
>>>     at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:680)
>>>     at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:123)
>>>     .....[snip].....
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
>>> Caused by: java.io.IOException: Server returned HTTP response code: 400 for URL: http://hadooptest3:50060/tasklog?taskid=attempt_201104081532_0509_m_000002_2&all=true
>>>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
>>>     at java.net.URL.openStream(URL.java:1010)
>>>     at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
>>>     ... 16 more
>>> Ended Job = job_201104081532_0509 with exception 'java.lang.RuntimeException(Error while reading from task log url)'
>>> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
>>>
>>> It seems to me the key point in here is this:
>>>
>>> Server returned HTTP response code: 400 for URL: http://hadooptest3:50060/tasklog?taskid=attempt_201104081532_0509_m_000002_2&all=true
>>>
>>> So I submitted that URL to my web browser, which said:
>>>
>>> Problem accessing /tasklog. Reason: Argument attemptid is required
>>>
>>> Does anyone have clues what this means?
>>>
>>> --
>>> Tim Ellis
>>> Riot Games
>>> * That is to say, simple queries like "select * from table limit 10"
>>> return results just fine.
>>>
>>
>> --
>> Tim
>>
>
--
Tim
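
The 400 response in the quoted messages says "Argument attemptid is required", while the URL Hive requests carries the value under "taskid". A minimal sketch for trying the log URL by hand, assuming (the thread does not confirm this) that the TaskTracker accepts the same value under the name "attemptid". The helper `rename_query_param` is hypothetical and only rewrites the string; it does not contact the cluster.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def rename_query_param(url, old, new):
    """Return url with every query parameter named `old` renamed to `new`."""
    parts = urlsplit(url)
    pairs = [
        (new if key == old else key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(pairs)))


# URL copied from the stack trace in the quoted message.
failing_url = (
    "http://hadooptest3:50060/tasklog"
    "?taskid=attempt_201104081532_0509_m_000002_2&all=true"
)
print(rename_query_param(failing_url, "taskid", "attemptid"))
```

If the rewritten URL returns the task log in a browser, that would suggest the failure is in how the URL is built rather than in the task itself, and the real task error would then be visible in that log.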