Hi,

I'm running into an OOM issue trying to do a UNION DISTINCT on a bunch of Avro
files.

The query is something like this:

with gold as ( select * from table1 where local_date = '2019-01-01'),
     delta as ( select * from table2 where local_date = '2019-01-01')
insert overwrite table table3 PARTITION (local_date)
select * from gold
union distinct
select * from delta;
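
(For completeness: since this is a dynamic partition insert, I'm running with
roughly these session settings; values from memory, so treat them as approximate.)

-- dynamic partitioning has to be enabled for PARTITION (local_date) with no value
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;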

The same query with UNION ALL works. The data size is in the low gigabytes, and
I'm running on six 16 GB nodes (I've tried larger nodes and higher memory
settings, but that only postpones the error).
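
For comparison, this is the variant that completes fine, i.e. the same CTEs with
UNION ALL instead of UNION DISTINCT:

with gold as ( select * from table1 where local_date = '2019-01-01'),
     delta as ( select * from table2 where local_date = '2019-01-01')
insert overwrite table table3 PARTITION (local_date)
select * from gold
union all       -- only difference: ALL instead of DISTINCT
select * from delta;

The memory settings I've been raising are the usual mapper heap ones, along these
lines (example values, not the exact numbers I used):

set mapreduce.map.memory.mb=8192;
set mapreduce.map.java.opts=-Xmx6g;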

Mappers fail with errors like this (the stack traces are not all the same):

2019-03-11 13:37:22,381 ERROR [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.hadoop.io.Text.setCapacity(Text.java:268)
        at org.apache.hadoop.io.Text.set(Text.java:224)
        at org.apache.hadoop.io.Text.set(Text.java:214)
        at org.apache.hadoop.io.Text.<init>(Text.java:93)
        at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.copyObject(WritableStringObjectInspector.java:36)
        at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:418)
        at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:442)
        at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:428)
        at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.deepCopyElements(KeyWrapperFactory.java:152)
        at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.deepCopyElements(KeyWrapperFactory.java:144)
        at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.copyKey(KeyWrapperFactory.java:121)
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:805)
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:719)
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:787)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
        at org.apache.hadoop.hive.ql.exec.UnionOperator.process(UnionOperator.java:148)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
        at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:148)
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:547)
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)

I've tried Hive 2.3.2 and Hive 2.3.4, with both the Tez and MR execution engines.

I've tried running with more and with fewer mappers, but I always hit the OOM.
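
Concretely, the knobs I've been turning are roughly these (example values):

-- switch execution engines
set hive.execution.engine=mr;    -- or tez
-- fewer, larger mappers on MR by letting CombineHiveInputFormat build bigger splits
set mapreduce.input.fileinputformat.split.maxsize=1073741824;
-- roughly the equivalent split-grouping knob on Tez
set tez.grouping.max-size=1073741824;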

I'm running a similar query on different (much larger) data without issues, so I
suspect something in the actual data.

The table schema is this:
c1    string                    
c2    bigint                    
c3    array<map<string,string>> 
local_date  string
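
If it helps, the DDL is roughly this (reconstructed from memory, so only a sketch):

create table table1 (
  c1 string,
  c2 bigint,
  c3 array<map<string,string>>
)
partitioned by (local_date string)
stored as avro;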


I've narrowed it down, and (not surprisingly) the third column, the
array<map<string,string>>, seems to be the cause of the issue: if I remove that
column, the union works just fine.
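
By "remove that" I mean dropping c3 from the select, roughly like this, which runs fine:

-- union distinct over just c1 and c2 (plus the partition column) completes without OOM
select c1, c2, local_date from table1 where local_date = '2019-01-01'
union distinct
select c1, c2, local_date from table2 where local_date = '2019-01-01';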

Has anyone seen something similar? Any pointers on how to tackle this?

Kind regards,

 Patrick
