When the following query was run with mapred.job.reuse.jvm.num.tasks=20, some 
of the map tasks failed with "Error: Java heap space", causing the job to fail. 
After changing to mapred.job.reuse.jvm.num.tasks=1, the job succeeded.

FROM (
    FROM intable1
    SELECT acct_id, esn) b
JOIN (
    FROM intable2
    SELECT acct_id, xid, devtype_id, esn, other_properties['cdndldist'] AS cdndldist
    WHERE (dateint>=20110201 AND dateint<=20110228)
    AND (other_properties['cdndldist'] IS NOT NULL)
    AND (client_msg_type='endplay')
    AND (devtype_id=272 OR devtype_id=129 OR devtype_id=12 OR devtype_id=13 OR devtype_id=14)) c
ON (b.acct_id=c.acct_id) AND (b.esn=c.esn)
INSERT OVERWRITE TABLE outtable PARTITION (dateint='20110201-20110228')
SELECT b.acct_id, c.xid, c.devtype_id, c.esn, c.cdndldist;
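
For reference, a minimal sketch of the two settings compared above, assuming the property is set per session from the Hive CLI before the query runs (it could equally have been set in the job or site configuration; how it was set is an assumption here):

```sql
-- Assumption: issued in the Hive CLI session before the query.
-- Failing run: each task JVM is reused for up to 20 tasks of the same job.
SET mapred.job.reuse.jvm.num.tasks=20;

-- Successful run: 1 means no reuse, so every map task gets a fresh JVM
-- (and therefore a fresh heap).
SET mapred.job.reuse.jvm.num.tasks=1;
```

Only the last value set before the query is submitted applies; the two statements are shown together just to contrast the runs.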

I'm on a pre-release version of Hive 0.7, with Hadoop 0.20. Does anyone know 
about any Hive/Hadoop issue that may be related to this?

Thanks.
Steven
