The URI provided in your log is wrong; check http://10.40.35.54:9103/tasklog?attemptid=attempt_201209171421_0029_m_000000_0&start=-8193 (the failing URL from your stack trace uses taskid= where it should use attemptid=).
Shouldn't this misleading URI issue be a bug?

On Tue, Sep 18, 2012 at 11:14 PM, Aniket Daoo <ad...@infocepts.com> wrote:

> Hi,
>
> I have a ROW FORMAT SERDE table created using the following DDL:
>
> CREATE external TABLE multivalset_u6
> (
>   col1 string,
>   col2 string,
>   col3 string,
>   col4 string,
>   col5 string
> )
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
> WITH SERDEPROPERTIES
> (
>   "input.regex" = "(.*)\\t(.*)\\t~([0-9]{6})~(.*)~(.*)",
>   "output.format.string" = "%1$s %2$s %3$s %4$s %5$s"
> )
> STORED AS TEXTFILE
> LOCATION '/user/admin/u6/parsed/';
>
> The above table LOCATION contains the file to be read by this table. I need
> the original file to be parsed and stored as a tab-delimited file with 5
> columns.
>
> Sample row from the original file:
>
> 02-15-2012-11:34:56 873801356593332362 ~3261961~1~10.0
>
> Sample row from the expected parsed file:
>
> 02-15-2012-11:34:56 873801356593332362 3261961 1 10.0
>
> To do this, I was trying to create a table with 5 columns at another
> location and insert data from the table multivalset_u6 into it. I
> encountered the following message on the console while doing so:
>
> Ended Job = job_201209171421_0029 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_201209171421_0029_m_000002 (and more) from job
> job_201209171421_0029
> Exception in thread "Thread-47" java.lang.RuntimeException: Error while
> reading from task log url
>         at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
>         at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)
>         at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: Server returned HTTP response code: 400 for URL:
> http://10.40.35.54:9103/tasklog?taskid=attempt_201209171421_0029_m_000000_0&start=-8193
>         at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
>         at java.net.URL.openStream(URL.java:1010)
>         at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
>         ... 3 more
> Counters:
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
> MapReduce Jobs Launched:
> Job 0: Map: 1  HDFS Read: 0  HDFS Write: 0  FAIL
> Total MapReduce CPU Time Spent: 0 msec
>
> I have observed that when I execute a SELECT * FROM multivalset_u6, I get
> the output with all the columns as expected. However, on executing a SELECT
> on individual columns like SELECT col1, col2, col3, col4, col5 FROM
> multivalset_u6, a similar error message appears.
>
> Am I missing something here? Is there a way to work around this?
>
> Thanks,
> Aniket
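One thing worth checking locally while the task log is unreadable: in the sample row quoted above, the id "3261961" has seven digits, but the `input.regex` group `([0-9]{6})` requires exactly six, so that row cannot match the pattern as written. The sketch below checks this with Python's `re` module (which treats this simple pattern the same way Java's regex engine would); it assumes the blanks in the quoted sample row were originally tab characters that the mail client flattened to spaces.

```python
import re

# Pattern from the DDL. HiveQL doubles the backslashes ("\\t"), so the
# regex the SerDe actually compiles is the one below, with a literal \t.
pattern = re.compile(r"(.*)\t(.*)\t~([0-9]{6})~(.*)~(.*)")

# Sample input row, assuming tab-separated fields (an assumption: the
# email shows spaces, likely flattened tabs).
row = "02-15-2012-11:34:56\t873801356593332362\t~3261961~1~10.0"

print(pattern.match(row))  # None: "3261961" is seven digits, {6} demands exactly six

# Allowing one-or-more digits lets the sample row split into the five
# expected columns:
relaxed = re.compile(r"(.*)\t(.*)\t~([0-9]+)~(.*)~(.*)")
print(relaxed.match(row).groups())
# ('02-15-2012-11:34:56', '873801356593332362', '3261961', '1', '10.0')
```

This may or may not be what failed the job (the HTTP 400 on the task-log URL hides the real task error), but since the contrib RegexSerDe typically yields NULL columns for rows the regex does not match, it is a cheap thing to rule out first.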