Glad that presentation was useful to you :)
hive.exec.orc.memory.pool is the fraction of the heap that ORC writers are
allowed to use. If your heap size is 1GB and hive.exec.orc.memory.pool
is set to 0.5, then ORC writers can use a maximum of about 500MB of memory.
If there are more ORC writers open at the same time (one per open dynamic
partition, for example), that budget is shared across all of them, and each
writer's buffers are scaled down so the total stays within the pool.
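
To make that concrete, here is a minimal sketch; the 1GB task heap is an
assumed example (set via mapreduce.map.java.opts or your cluster's
equivalent), and the numbers are only illustrative:

    -- Assumed example: task JVM heap of 1 GB.
    -- With the pool at 0.5, all ORC writers in a task share roughly 512 MB.
    -- If ten dynamic-partition writers are open at once, each effectively
    -- gets about 51 MB of buffer before stripes are flushed.
    SET hive.exec.orc.memory.pool=0.5;
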
Prasanth -
This is easily the best and most complete explanation I've received to any
question I've ever posted online. I know that sounds like an overstatement,
but this answer is awesome. :) I really appreciate your insight on this.
My only follow-up is asking how the memory.pool percentage plays into this.
So one more follow-up:
The 16-.25 success turns into a failure if I throw more data (and hence more
partitions) at the problem. Could there be some sort of issue that rears
its head based on the number of output dynamic partitions?
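
One way to test that theory would be to route all rows for a given partition
to the same reducer, so each task keeps fewer ORC writers open at once. A
rough sketch, with placeholder table and column names (target_orc,
source_table, day):

    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    -- Distributing by the partition column means each reducer
    -- writes only its own subset of partitions.
    INSERT OVERWRITE TABLE target_orc PARTITION (day)
    SELECT col_a, col_b, day
    FROM source_table
    DISTRIBUTE BY day;
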
Thanks all!
On Sun, Apr 27, 2014 at 3:33 PM, John Omernik wrote:
Here is some testing. I focused on two variables (not really understanding
what they do):
orc.compress.size (256k by default)
hive.exec.orc.memory.pool (0.50 by default).
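
Roughly, those two knobs are applied like this; the values shown are just
the defaults, and the table name is a placeholder (note that
orc.compress.size is a table property rather than a session setting):

    -- Session-level: fraction of the heap ORC writers may use.
    SET hive.exec.orc.memory.pool=0.50;

    -- Per-table: compression buffer size, 262144 bytes = 256 KB.
    CREATE TABLE orc_test (id BIGINT, payload BINARY)
    STORED AS ORC
    TBLPROPERTIES ("orc.compress.size"="262144");
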
The job I am running is an admittedly complex job running through a Python
transform script. However, as noted above, RCFile writes of the same data
complete without a problem.
Hello all,
I am working with Hive 0.12 right now on YARN. When I am writing a table
that is admittedly quite "wide" (lots of columns, nearly 60, including one
binary field that can get quite large), some tasks will fail on the ORC
file write with Java Heap Space issues.
I have confirmed that writing the same data to an RCFile table succeeds, so
the problem appears specific to the ORC writer.
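
For reference, the table looks roughly like this; the DDL below is a
placeholder with made-up names, showing only a few of the ~60 columns:

    -- Placeholder DDL: only a handful of the real columns are shown.
    CREATE TABLE wide_orc (
      col_01   STRING,
      col_02   STRING,
      big_blob BINARY
    )
    PARTITIONED BY (day STRING)
    STORED AS ORC;
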