Hi

Can you retry with hive.optimize.sort.dynamic.partition set to true?
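
For reference, a minimal sketch of that retry (with this setting Hive should sort rows by the dynamic partition keys, so each reducer only needs one ORC writer open at a time instead of one per partition):

    set hive.optimize.sort.dynamic.partition=true;
    -- then re-run the same INSERT OVERWRITE from your mail below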

Thanks
Prasanth

On Dec 18, 2015, at 3:48 AM, Hemanth Meka <hemanth.m...@datametica.com> wrote:

Hi All,

We have a source table and a target table. The data in the source table is text 
and has no partitions; the target table is an ORC table with 5 partition columns, 
6000 records, and 1400 partitions.

We are trying to insert overwrite the target table with the data from the source 
table. This INSERT OVERWRITE command causes a memory exception, which is pasted 
below.
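
For reference, the statement is roughly of this shape (the table, column, and partition names here are placeholders, not our actual schema):

    set hive.exec.dynamic.partition=true;
    set hive.exec.dynamic.partition.mode=nonstrict;
    -- all 5 partition columns are dynamic; ~1400 distinct combinations
    insert overwrite table target_orc partition (p1, p2, p3, p4, p5)
    select c1, c2, p1, p2, p3, p4, p5 from source_text;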

It works if we change the target table from ORC to TEXT, or if we keep it as an 
ORC table and remove the partitions.
It also runs fine for 1000 records.

Has anyone faced this issue with ORC? Any workarounds are much appreciated.

We tried options like the following (see the sketch after the list for how they were applied):

    ("orc.compress"="SNAPPY");
    ("orc.stripe.size"=67108864);
    ("orc.compress.size"=8192);
    set mapreduce.map.memory.mb=4096;
    set mapreduce.reduce.memory.mb=5120;
    using distribute by and sort by
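
For clarity, this is roughly how those were applied (the table name is a placeholder):

    -- ORC properties set on the target table
    alter table target_orc set tblproperties (
        "orc.compress"="SNAPPY",
        "orc.stripe.size"="67108864",
        "orc.compress.size"="8192");
    -- session-level memory settings
    set mapreduce.map.memory.mb=4096;
    set mapreduce.reduce.memory.mb=5120;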

PS:

Decreasing the number of partition columns to 3, with 65 partition values, works fine.

Logging initialized using configuration in file:/etc/hive/conf/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.6.0-2800/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.6.0-2800/hive/lib/hive-jdbc-0.14.0.2.2.6.0-2800-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Query ID = sindhurak_20151218140101_3616d146-a1e6-4676-8865-de15d8b5b7eb
Total jobs = 1
Launching Job 1 out of 1


Status: Running (Executing on YARN cluster with App id application_1449146890024_15817)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Reducer 2             FAILED      1          0        0        1       4       0
--------------------------------------------------------------------------------
VERTICES: 01/02  [=============>>-------------] 50%   ELAPSED TIME: 86.62 s
--------------------------------------------------------------------------------
Status: Failed
Vertex failed, vertexName=Reducer 2, vertexId=vertex_1449146890024_15817_1_01, diagnostics=[Task failed, taskId=task_1449146890024_15817_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Container container_e24_1449146890024_15817_01_000002 finished with diagnostics set to [Container failed. Exception from container-launch.
Container id: container_e24_1449146890024_15817_01_000002
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1
]], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerWriterV2.<init>(RunLengthIntegerWriterV2.java:145)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.createIntegerWriter(WriterImpl.java:642)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.<init>(WriterImpl.java:1058)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.createTreeWriter(WriterImpl.java:1816)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.access$1500(WriterImpl.java:98)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.<init>(WriterImpl.java:1587)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.createTreeWriter(WriterImpl.java:1841)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.<init>(WriterImpl.java:195)
        at org.apache.hadoop.hive.ql.io.orc.OrcFile.createWriter(OrcFile.java:435)
        at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:84)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:689)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:328)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:218)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:168)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
        ... 13 more
], TaskAttempt 2 failed, info=[Container container_e24_1449146890024_15817_01_000004 finished with diagnostics set to [Container failed. Exception from container-launch.
Container id: container_e24_1449146890024_15817_01_000004
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1
]], TaskAttempt 3 failed, info=[Container container_e24_1449146890024_15817_01_000005 finished with diagnostics set to [Container failed. ]]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1449146890024_15817_1_01 [Reducer 2] killed/failed due to:null]
DAG failed due to vertex failure. failedVertices:1 killedVertices:0
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask
