One possible cause is a memory leak; another is heap fragmentation. The OOM 
happened while the mapper was allocating the buffer whose size is controlled by 
io.sort.mb, and my setup has io.sort.mb=500 with -Xmx1000m, so that buffer 
alone claims half the heap.
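
To illustrate (a sketch only; the alternative values below are guesses I have 
not verified, not tested settings): either shrinking the sort buffer or 
growing the task heap should leave that allocation more headroom.

SET io.sort.mb=200;                      -- sort buffer well under half the heap
SET mapred.child.java.opts=-Xmx1000m;    -- task JVM heap, unchanged from my setup
-- or keep io.sort.mb=500 and raise the heap instead:
-- SET mapred.child.java.opts=-Xmx1600m;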


From: Igor Tatarinov [mailto:i...@decide.com]
Sent: Friday, April 08, 2011 9:50 AM
To: user@hive.apache.org
Cc: Steven Wong
Subject: Re: Mapper OOMs disappear after disabling JVM reuse

I had a similar problem until I set this parameter to 1 (although 3 seems to 
work fine too).

There is an explanation somewhere on the web. Basically, if you run 20 tasks 
in the same JVM and the garbage collector cannot keep up with the accumulated 
garbage, the Java process grows too big; when it finally forks a new process, 
the forked process and the original one together take up too much memory. Or 
something like that.

I imagine the same kind of thing would still happen with reuse set to 1 if a 
single task happens to consume a lot of memory and CPU cycles, so something 
else must be going on here. Perhaps there is a memory 'leak' and some 
'garbage' never gets collected between task executions.
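
One way to test that leak hypothesis (a sketch, not something I've run against 
this job): turn on GC logging in the task JVMs and watch whether heap occupancy 
after each full GC climbs as a reused JVM runs successive tasks.

SET mapred.child.java.opts=-Xmx1000m -verbose:gc -XX:+PrintGCDetails;
-- If the post-full-GC live set grows with every task the JVM executes,
-- something is being retained between task executions.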


On Thu, Apr 7, 2011 at 6:53 PM, Steven Wong <sw...@netflix.com> wrote:
When the following query was run with mapred.job.reuse.jvm.num.tasks=20, some 
of the map tasks failed with "Error: Java heap space", causing the job to fail. 
After changing to mapred.job.reuse.jvm.num.tasks=1, the job succeeded.
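
For concreteness, the change amounts to one session setting before running the 
query (a sketch of what I changed; 20 was the failing value):

SET mapred.job.reuse.jvm.num.tasks=1;  -- 1 = each JVM runs a single task (no reuse); was 20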

FROM (
    FROM intable1
    SELECT acct_id, esn) b
JOIN (
    FROM intable2
    SELECT acct_id, xid, devtype_id, esn,
        other_properties['cdndldist'] AS cdndldist
    WHERE (dateint>=20110201 AND dateint<=20110228)
    AND (other_properties['cdndldist'] is not null)
    AND (client_msg_type='endplay')
    AND (devtype_id=272 OR devtype_id=129 OR devtype_id=12
        OR devtype_id=13 OR devtype_id=14)) c
ON (b.acct_id=c.acct_id) AND (b.esn=c.esn)
INSERT OVERWRITE TABLE outtable PARTITION (dateint='20110201-20110228')
SELECT b.acct_id, c.xid, c.devtype_id, c.esn, c.cdndldist;

I'm on a pre-release version of Hive 0.7, with Hadoop 0.20. Does anyone know 
about any Hive/Hadoop issue that may be related to this?

Thanks.
Steven
