I get "GC overhead limit exceeded" and "Java heap space" errors when inserting from a staging table into a wide (~2,500 columns) partitioned ORC table with dynamic partitioning. It consistently fails with Tez but succeeds with smaller data volumes on the MR execution engine. We have been adjusting the settings below with MR, but it still fails whenever there is more than one day of data.
set mapreduce.map.memory.mb=16288;
set mapreduce.map.java.opts=-Xmx10192m;
set mapreduce.reduce.memory.mb=16288;
set mapreduce.reduce.java.opts=-Xmx10192m;

The insert statement uses a very simple select, as below:

INSERT INTO TABLE table_partitioned PARTITION (partition_key)
SELECT 999999,
       *,
       IF(Date <> 'None', SUBSTR(Date, 1, 10), '1970-01-01') AS partition_key
FROM stg_table;

The ORC table uses Snappy compression. Nothing else is running on this cluster. Are there settings to reduce the GC / memory footprint of this insert so that it can succeed with more data?

Thanks,
Revin
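(Edit for context, not part of the original question: one setting commonly associated with this memory profile is hive.optimize.sort.dynamic.partition. With dynamic partitioning, each task otherwise keeps one ORC writer open per partition it has seen, and each writer buffers every column stream, which adds up fast at ~2,500 columns. A sketch of what one might try, assuming Hive 0.14+ where these properties exist:)

```sql
-- Assumption: Hive 0.14+. Sorts rows by the dynamic partition key so a
-- task holds only one open ORC writer at a time instead of one per
-- partition, sharply reducing heap pressure from writer buffers.
SET hive.optimize.sort.dynamic.partition=true;

-- Optional: shrink the per-stream ORC buffer; with ~2,500 columns the
-- default 256 KB per column stream is a large multiplier.
SET hive.exec.orc.default.buffer.size=65536;
```

(These are suggestions to experiment with, not settings confirmed by the poster's environment.)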