Patrick, can you bump up the mapper memory and see if it helps?

SET mapreduce.map.memory.mb=3072;
SET mapreduce.map.java.opts=-Xmx2560m;
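If that alone doesn't help: the stack trace below fails inside map-side hash
aggregation (GroupByOperator sits directly above UnionOperator in the trace),
which is where the UNION DISTINCT deduplication happens. One knob worth
trying, untested against your data, is pushing that dedup into the reducers,
or shrinking the hash table's share of the mapper heap:

-- disable map-side aggregation so the dedup runs in the reducers
SET hive.map.aggr=false;
-- or keep it, but cap the hash table's share of mapper memory (default 0.5)
SET hive.map.aggr.hash.percentmemory=0.25;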
Regards,
Venkatesh

On Mon, Mar 11, 2019 at 7:29 AM Patrick Duin <patd...@gmail.com> wrote:
> Hi,
>
> I'm running into an OOM issue trying to do a UNION DISTINCT on a bunch of
> AVRO files.
>
> The query is something like this:
>
> with gold as (select * from table1 where local_date='2019-01-01'),
> delta as (select * from table2 where local_date='2019-01-01')
> insert overwrite table table3 partition (local_date)
> select * from gold
> union distinct
> select * from delta;
>
> UNION ALL works. The data size is in the low gigabytes and I'm running on
> six 16 GB nodes (I've tried larger nodes and higher memory settings, but
> that just postpones the error).
>
> Mappers fail with errors (the stack traces are not all the same):
>
> 2019-03-11 13:37:22,381 ERROR [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at org.apache.hadoop.io.Text.setCapacity(Text.java:268)
>     at org.apache.hadoop.io.Text.set(Text.java:224)
>     at org.apache.hadoop.io.Text.set(Text.java:214)
>     at org.apache.hadoop.io.Text.<init>(Text.java:93)
>     at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.copyObject(WritableStringObjectInspector.java:36)
>     at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:418)
>     at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:442)
>     at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:428)
>     at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.deepCopyElements(KeyWrapperFactory.java:152)
>     at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.deepCopyElements(KeyWrapperFactory.java:144)
>     at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.copyKey(KeyWrapperFactory.java:121)
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:805)
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:719)
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:787)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>     at org.apache.hadoop.hive.ql.exec.UnionOperator.process(UnionOperator.java:148)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>     at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>     at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
>     at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:148)
>     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:547)
>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
>
> I've tried Hive 2.3.2 and Hive 2.3.4, with both the Tez and MR engines.
>
> I've tried running with more and with fewer mappers, always hitting the OOM.
>
> I'm running a similar query on different (much larger) data without issues,
> so I suspect something in the actual data.
>
> The table schema is this:
>
> c1          string
> c2          bigint
> c3          array<map<string,string>>
> local_date  string
>
> I've narrowed it down, and (not surprisingly) the third column seems to be
> the cause of the issue; if I remove it, the union works just fine.
>
> Has anyone had similar experiences? Perhaps any pointers on how to tackle
> this?
>
> Kind regards,
>
> Patrick

--
Regards,
Venkatesh Selvaraj