[ https://issues.apache.org/jira/browse/HIVE-15105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15645476#comment-15645476 ]
Prasanth Jayachandran commented on HIVE-15105:
----------------------------------------------

HIVE-11751 should fix this.

> Hive shell runs out of memory on Tez
> ------------------------------------
>
>                 Key: HIVE-15105
>                 URL: https://issues.apache.org/jira/browse/HIVE-15105
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: 2.0.1
>            Reporter: Premal Shah
>
> Hive 2.0.1
> Hadoop 2.7.2
> Tez 0.8.4
> We have a UDF in hive which takes in some values and outputs a score. When
> running a query on a table which calls the score function on every row, looks
> like tez is not running the query on YARN, but trying to run it in local
> mode. It then runs out of memory trying to insert that data into a table.
> Here's the query
> {noformat}
> ADD JAR score.jar;
> CREATE TEMPORARY FUNCTION score AS 'hive.udf.ScoreUDF';
> CREATE TABLE abc AS
> SELECT
>     id,
>     score(col1, col2) as score
>     , '2016-10-11' AS dt
> FROM input_table
> ;
> {noformat}
> Here's the output of the shell
> {noformat}
> Query ID = hadoop_20161028232841_5a06db96-ffaa-4e75-a657-c7cb46ccb3f5
> Total jobs = 1
> Launching Job 1 out of 1
> java.lang.OutOfMemoryError: Java heap space
>       at java.util.Arrays.copyOf(Arrays.java:3332)
>       at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
>       at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
>       at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:622)
>       at java.lang.StringBuilder.append(StringBuilder.java:202)
>       at com.google.protobuf.TextFormat.escapeBytes(TextFormat.java:1283)
>       at com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:394)
>       at com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:327)
>       at com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:286)
>       at com.google.protobuf.TextFormat$Printer.print(TextFormat.java:273)
>       at com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:404)
>       at com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:327)
>       at com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:286)
>       at com.google.protobuf.TextFormat$Printer.print(TextFormat.java:273)
>       at com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:404)
>       at com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:327)
>       at com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:286)
>       at com.google.protobuf.TextFormat$Printer.print(TextFormat.java:273)
>       at com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:404)
>       at com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:327)
>       at com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:283)
>       at com.google.protobuf.TextFormat$Printer.print(TextFormat.java:273)
>       at com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:404)
>       at com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:327)
>       at com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:283)
>       at com.google.protobuf.TextFormat$Printer.print(TextFormat.java:273)
>       at com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:404)
>       at com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:327)
>       at com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:286)
>       at com.google.protobuf.TextFormat$Printer.print(TextFormat.java:273)
>       at com.google.protobuf.TextFormat$Printer.access$400(TextFormat.java:248)
>       at com.google.protobuf.TextFormat.shortDebugString(TextFormat.java:88)
> FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Java heap space
> {noformat}
> It looks like the job is not getting submitted to the cluster, but running
> locally. We can't get tez to run the query on the cluster.
> The hive shell starts with an Xmx of 4G.
> If I set hive.execution.engine = mr, then the query works, because it runs on
> the hadoop cluster.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)