On Jun 18, 2009, at 8:45 AM, Leon Mergen wrote:
Could you perhaps elaborate on that 100 MB limit ? Is that due to a
limit that is caused by the Java VM heap size ? If so, could that,
for example, be increased to 512MB by setting mapred.child.java.opts
to '-Xmx512m' ?
A couple of points:
1. The 100MB was just for ballpark calculations. Of course if you
have a large heap, you can fit larger values. Don't forget that the
framework is allocating big chunks of the heap for its own buffers,
when figuring out how big to make your heaps.
2. Having large keys is much harder than large values. When doing a
N-way merge, the framework has N+1 keys and 1 value in memory at a time.
-- Owen