Any chance you can convert the data to a tab-separated text file and try the
same query?

It may not be the SerDe, but it would be good to rule it out as a
potential source of the problem.
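
As a rough sketch of what I mean (the table name medium_table_tsv, the
column list, and the file path are placeholders I made up, so substitute
your own; this also assumes you have already converted the log to
tab-separated text outside of Hive, so the custom SerDe is never involved):

```sql
-- Hypothetical plain-text copy of the table, read with Hive's built-in
-- delimited row format instead of the custom SerDe.
CREATE TABLE medium_table_tsv (
  col1 STRING,
  col2 STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- Placeholder path to the converted file on the local filesystem.
LOAD DATA LOCAL INPATH '/tmp/medium_table.tsv'
INTO TABLE medium_table_tsv;

-- Same query as before, against the plain-text table.
SELECT count(1) FROM medium_table_tsv;
```

If the count succeeds here but still fails on the original table, that
points toward the SerDe (or a row it cannot parse); if it fails the same
way, the problem is more likely in the job or the machine's resources.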

-Ajo.

On Wed, Jan 26, 2011 at 5:47 PM, Christopher, Pat <
patrick.christop...@hp.com> wrote:

> Hi,
>
> I’m attempting to load a small-to-medium-sized log file, ~250MB, and
> produce some basic reports from it, counts etc.  Nothing fancy.  However,
> whenever I try to read the entire dataset, ~330k rows, I get the following
> error:
>
>
>
>   FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.MapRedTask
>
>
>
> This result gets produced with basic queries like:
>
>
>
>   SELECT count(1) FROM medium_table;
>
>
>
> However, if I do the following:
>
>
>
>   SELECT count(1) FROM ( SELECT col1 FROM medium_table LIMIT 70000 ) tbl;
>
>
>
> It works okay until I raise the limit to around 70,800, at which point I get
> the first error message again.  I’m running my HDFS system in single-node,
> pseudo-distributed mode with 1.5GB of memory and 20 GB of disk as a virtual
> machine.  And I am using a custom SerDe.  I don’t think it’s the SerDe but
> I’m open to suggestions for how I can check if it is causing the problem.  I
> can’t see anything in the data that would be causing it though.
>
>
>
> Anyone have any ideas of what might be causing this or something I can
> check?
>
>
>
> Thanks,
>
> Pat
>
