Clément Stenac created PIG-3170:
-----------------------------------
Summary: Pig keeps static references to Hadoop's Context after end
of task
Key: PIG-3170
URL: https://issues.apache.org/jira/browse/PIG-3170
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.10.0
Reporter: Clément Stenac
Priority: Minor
Through PigStatusReporter and ProgressableReporter, Pig keeps static references
to Hadoop's Context object after a Pig MR task is done.
Additionally, PigCombiner keeps a static reference of its own, apparently
without using it.
When the JVM is reused between MR tasks, it can cause large memory
overconsumption, with a peak during the creation of the next task, because
while MR is creating the next task (in MapTask.<init> for example), we have
both contexts (with their associated buffers) allocated at once.
This problem is especially important when using a Combiner, because the
ReduceContext of a Combiner contains references to large sort buffers.
The specifics of our case were:
* 20 GB input data, divided in 85 map tasks
* Very simple Pig script: LOAD A, FILTER A, GROUP A, FOREACH group generate
MAX(field), STORE
* MapR distribution, which automatically computes Xmx for mappers at 800MB
* At the end of the first task, the ReduceContext contains more than 400MB of
byte[]
* Systematic OOM in MapTask.<init> on subsequent VM reuse
* At least -Xmx1200m was required to get the job to complete
* With attached patch, -Xmx600m is enough
While increasing Xmx is a possible workaround, I think the large
overconsumption and the complexity of debugging the issue (because the OOM
actually happens at the very beginning of the task, before the first byte of
data has been processed) warrant fixing it.
The attached patch makes sure that PigStatusReporter and ProgressableReporter
drop their reference to the Context in the cleanup phases of the task.
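
As a rough illustration of the idea (not the attached patch itself), here is a
minimal, self-contained Java sketch. All names in it (TaskSketch,
ReporterSketch, FakeContext, runTask) are hypothetical stand-ins, not Pig or
Hadoop classes: a JVM-wide singleton holds the per-task context, and the task's
cleanup step clears that reference so the buffers become collectable before the
next task is created in a reused JVM.

```java
/** Hypothetical driver: runs one "task" and shows the reference is dropped. */
class TaskSketch {
    /** Simulates one task attempt running inside a reused JVM. */
    static void runTask(int bufferBytes) {
        ReporterSketch reporter = ReporterSketch.getInstance();
        reporter.setContext(new FakeContext(bufferBytes));
        try {
            reporter.progress(); // task body would report progress here
        } finally {
            // Cleanup phase: drop the static path to the context.
            reporter.cleanup();
        }
    }

    public static void main(String[] args) {
        runTask(1 << 20);
        // After cleanup the singleton no longer pins the old context.
        System.out.println(ReporterSketch.getInstance().holdsContext()); // prints "false"
    }
}

/** Hypothetical stand-in for a task context that owns large buffers. */
final class FakeContext {
    final byte[] sortBuffer;

    FakeContext(int bytes) {
        this.sortBuffer = new byte[bytes];
    }
}

/** Hypothetical JVM-wide singleton in the style of a status reporter. */
final class ReporterSketch {
    private static final ReporterSketch INSTANCE = new ReporterSketch();

    // Reachable from a static field, so it outlives the task unless cleared.
    private FakeContext context;

    private ReporterSketch() {}

    static ReporterSketch getInstance() {
        return INSTANCE;
    }

    synchronized void setContext(FakeContext c) {
        context = c;
    }

    synchronized boolean holdsContext() {
        return context != null;
    }

    synchronized void progress() {
        if (context != null) {
            // A real reporter would forward the progress call to the context.
        }
    }

    /** Called from the task's cleanup phase. */
    synchronized void cleanup() {
        context = null;
    }
}
```

Without the cleanup() call, the singleton would keep the previous task's
FakeContext (and its buffer) reachable while the next one is being allocated,
which is the situation described above.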
No new test is included because I don't think it is really possible to write a
unit test for this: the issue is not "binary".