So it's filling up on the emitting stage, so I need to look at the task logs and/or my script that's printing to stdout as the likely culprits, I'm guessing.
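For what it's worth, a minimal sketch of the script-side check I mean, assuming the transform script is Python (the field layout and the size cap here are hypothetical, not from the actual job): a TRANSFORM script should stream one line in, one line out, and keep any single emitted line bounded, since Hive's ScriptOperator materializes each output line in memory on its side.

```python
#!/usr/bin/env python
# Hypothetical sketch of a Hive TRANSFORM script that streams line-by-line
# instead of buffering, and caps field size so no single emitted row is huge
# (a single enormous output line can blow the consumer's heap).
import sys

MAX_FIELD_BYTES = 1 << 20  # assumption: cap any one field at 1 MB


def process(line):
    # Placeholder parse step; the real script would decode the binary record.
    fields = line.rstrip("\n").split("\t")
    # Truncate oversized fields so no single output line is enormous.
    return "\t".join(f[:MAX_FIELD_BYTES] for f in fields)


def main(stdin=sys.stdin, stdout=sys.stdout):
    for line in stdin:                       # stream: never slurp all input
        stdout.write(process(line) + "\n")   # emit one row at a time


if __name__ == "__main__":
    main()
```

If the OOM only shows up when many files run together, logging the length of each emitted line from the script would show whether one input produces a pathological row.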
On Wed, Jan 30, 2013 at 9:11 AM, Philip Tromans <philip.j.trom...@gmail.com> wrote:

> That particular OutOfMemoryError is happening on one of your hadoop nodes.
> It's the heap within the process forked by the hadoop tasktracker, I think.
>
> Phil.
>
>
> On 30 January 2013 14:28, John Omernik <j...@omernik.com> wrote:
>
>> So just a follow-up. I am less looking for specific troubleshooting on how to fix my problem, and more looking for a general understanding of heap space usage with Hive. When I get an error like this, is it heap space on a node, or heap space on my Hive server? Is it the heap space of the tasktracker? The heap of the job kicked off on the node? Which heap is being affected? If it's not clear from my output, where can I better understand this? I am sorely out of my league here when it comes to understanding the JVM interactions of Hive and Hadoop, i.e. where Hive runs vs. where the tasktrackers run, etc.
>>
>> Thanks in advance!
>>
>>
>> On Tue, Jan 29, 2013 at 7:43 AM, John Omernik <j...@omernik.com> wrote:
>>
>>> I am running a transform script that parses through a bunch of binary data. In 99% of cases it runs fine, but on certain files I get a failure (as seen below). The funny thing is, I can run a job with "only" the problem source file and it will work fine, but when it runs as part of a group of files, I get these warnings. I guess what I am asking here is this: where is the heap error? Is this occurring on the nodes themselves, or, since this is where the script is emitting records (and potentially large ones at that) and my Hive server running the job may be light on memory, could the issue actually be due to heap on the Hive server itself? My setup is 1 Hive node (that is woefully underpowered, under-memoried, and under-disk-I/Oed) and 4 beefy hadoop nodes.
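If Phil's read is right, the knob that matters is the heap of the forked task child JVM, not the Hive server's. A hedged sketch for a Hadoop 1.x-era setup like this one: the per-task heap comes from `mapred.child.java.opts`, which can be overridden per Hive session (the `-Xmx` value and the query shape below are purely illustrative):

```sql
-- Sketch only: raise the heap of the task child JVMs that run the
-- transform script (value is illustrative, not a recommendation).
SET mapred.child.java.opts=-Xmx1024m;

-- Then re-run the failing query, e.g. (hypothetical names):
-- SELECT TRANSFORM (col1, col2) USING 'my_script.py' AS (out1, out2)
-- FROM my_table;
```

Note this is a cluster-side setting: it changes the task JVMs on the hadoop nodes, which is consistent with the error appearing in the task logs rather than on the Hive server.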
>>> I guess my question is: is the heap issue on the sender or the receiver? :)
>>>
>>> 2013-01-29 08:20:24,107 INFO org.apache.hadoop.hive.ql.io.CodecPool: Got brand-new compressor
>>> 2013-01-29 08:20:24,107 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 12 forwarding 1 rows
>>> 2013-01-29 08:20:24,410 INFO org.apache.hadoop.hive.ql.exec.ScriptOperator: 3 forwarding 10 rows
>>> 2013-01-29 08:20:24,410 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 4 forwarding 10 rows
>>> 2013-01-29 08:20:24,411 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarding 10 rows
>>> 2013-01-29 08:20:24,411 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 6 forwarding 10 rows
>>> 2013-01-29 08:20:24,411 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: 8 forwarding 10 rows
>>> 2013-01-29 08:20:24,411 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 9 forwarding 10 rows
>>> 2013-01-29 08:20:24,411 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 10 forwarding 10 rows
>>> 2013-01-29 08:20:24,412 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 12 forwarding 10 rows
>>> 2013-01-29 08:20:27,170 INFO org.apache.hadoop.hive.ql.exec.ScriptOperator: 3 forwarding 100 rows
>>> 2013-01-29 08:20:27,170 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 4 forwarding 100 rows
>>> 2013-01-29 08:20:27,170 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarding 100 rows
>>> 2013-01-29 08:20:27,171 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 6 forwarding 100 rows
>>> 2013-01-29 08:20:27,171 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: 8 forwarding 100 rows
>>> 2013-01-29 08:20:27,171 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 9 forwarding 100 rows
>>> 2013-01-29 08:20:27,171 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 10 forwarding 100 rows
>>> 2013-01-29 08:20:27,171 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 12 forwarding 100 rows
>>> 2013-01-29 08:21:16,247 INFO org.apache.hadoop.hive.ql.exec.ScriptOperator: 3 forwarding 1000 rows
>>> 2013-01-29 08:21:16,247 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 4 forwarding 1000 rows
>>> 2013-01-29 08:21:16,247 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarding 1000 rows
>>> 2013-01-29 08:21:16,247 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 6 forwarding 1000 rows
>>> 2013-01-29 08:21:16,248 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: 8 forwarding 1000 rows
>>> 2013-01-29 08:21:16,248 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 9 forwarding 1000 rows
>>> 2013-01-29 08:21:16,248 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 10 forwarding 1000 rows
>>> 2013-01-29 08:21:16,248 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 12 forwarding 1000 rows
>>> 2013-01-29 08:25:47,532 INFO org.apache.hadoop.hive.ql.exec.ScriptOperator: 3 forwarding 10000 rows
>>> 2013-01-29 08:25:47,532 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 4 forwarding 10000 rows
>>> 2013-01-29 08:25:47,532 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarding 10000 rows
>>> 2013-01-29 08:25:47,532 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 6 forwarding 10000 rows
>>> 2013-01-29 08:25:47,532 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: 8 forwarding 10000 rows
>>> 2013-01-29 08:25:47,532 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 9 forwarding 10000 rows
>>> 2013-01-29 08:25:47,532 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 10 forwarding 10000 rows
>>> 2013-01-29 08:25:47,532 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 12 forwarding 10000 rows
>>> 2013-01-29 08:27:34,276 WARN org.apache.hadoop.hive.ql.exec.ScriptOperator: Exception in StreamThread.run(): Java heap space
>>> Cause: null
>>> 2013-01-29 08:27:34,277 WARN org.apache.hadoop.hive.ql.exec.ScriptOperator: java.lang.OutOfMemoryError: Java heap space
>>> at java.util.Arrays.copyOfRange(Arrays.java:3209)
>>> at java.lang.String.<init>(String.java:215)
>>> at java.nio.HeapCharBuffer.toString(HeapCharBuffer.java:542)
>>> at java.nio.CharBuffer.toString(CharBuffer.java:1157)
>>> at org.apache.hadoop.io.Text.decode(Text.java:350)
>>> at org.apache.hadoop.io.Text.decode(Text.java:327)
>>> at org.apache.hadoop.io.Text.toString(Text.java:254)
>>> at java.lang.String.valueOf(String.java:2826)
>>> at java.lang.StringBuilder.append(StringBuilder.java:115)
>>> at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:873)
>>> at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:181)
>>> at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.evaluate(ExprNodeGenericFuncEvaluator.java:163)
>>> at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:76)
>>> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
>>> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
>>> at org.apache.hadoop.hive.ql.exec.ScriptOperator$OutputStreamProcessor.processLine(ScriptOperator.java:477)
>>> at org.apache.hadoop.hive.ql.exec.ScriptOperator$StreamThread.run(ScriptOperator.java:563)
>>>
>>> 2013-01-29 08:27:34,306 INFO org.apache.hadoop.hive.ql.exec.ScriptOperator: ErrorStreamProcessor calling reporter.progress()
>>> 2013-01-29 08:27:34,307 INFO org.apache.hadoop.hive.ql.exec.ScriptOperator: StreamThread ErrorProcessor done
>>> 2013-01-29 08:27:34,307 ERROR org.apache.hadoop.hive.ql.exec.ScriptOperator: Script failed with code 1