Any chance you can convert the data to a tab-separated text file and try the same query?
It may not be the SerDe, but it would be good to rule that out as a potential source of the problem.

-Ajo.

On Wed, Jan 26, 2011 at 5:47 PM, Christopher, Pat <patrick.christop...@hp.com> wrote:

> Hi,
>
> I’m attempting to load a small to medium sized log file, ~250MB, and
> produce some basic reports from it, counts etc. Nothing fancy. However,
> whenever I try and read the entire dataset, ~330k rows, I get the following
> error:
>
> FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.MapRedTask
>
> This result gets produced with basic queries like:
>
> SELECT count(1) FROM medium_table;
>
> However, if I do the following:
>
> SELECT count(1) FROM ( SELECT col1 FROM medium_table LIMIT 70000 ) tbl;
>
> It works okay until I get to around 70,800ish, then I get the first error
> message again. I’m running my HDFS system in single node, pseudo
> distributed mode with 1.5GB of memory and 20 GB of disk as a virtual
> machine. And I am using a custom SerDe. I don’t think it’s the SerDe but
> I’m open to suggestions for how I can check if it is causing the problem. I
> can’t see anything in the data that would be causing it though.
>
> Anyone have any ideas of what might be causing this or something I can
> check?
>
> Thanks,
> Pat
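
For reference, the tab-separated test suggested at the top of this message could look roughly like the sketch below. This is only an illustration under assumptions: the table name medium_table_tsv, the single STRING column, and the path /tmp/medium_table.tsv are hypothetical placeholders, and the real column list would have to match the log data.

```sql
-- Hypothetical plain-text copy of the data; no custom SerDe involved.
-- The column list is a placeholder and must be adapted to the real schema.
CREATE TABLE medium_table_tsv (col1 STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- Hypothetical local path to the converted tab-separated file.
LOAD DATA LOCAL INPATH '/tmp/medium_table.tsv' INTO TABLE medium_table_tsv;

-- Same query that fails against the SerDe-backed table.
SELECT count(1) FROM medium_table_tsv;
```

If the count succeeds here but still fails on medium_table, the SerDe becomes the more likely suspect. If it fails in the same way, the cause probably lies elsewhere, and the logs of the failed map/reduce task would be the next place to look.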