Hi,

> Caused by: java.lang.ArrayIndexOutOfBoundsException
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1453)

In general, HDP-specific issues tend to get more attention on HCC, but this is a pretty old issue that stems from MapReduce being designed for fairly low-memory JVMs.

The io.sort.mb size is the reason for this crash: sort buffers larger than 1 GB hit a corner case in the wrap-around logic. As odd as this might sound, if you had fewer splits the sort buffer wouldn't wrap around enough times to generate a negative offset.

You can lower mapreduce.task.io.sort.mb to 1024 (MB) or less as a workaround, at the cost of some speed.

I ran into this issue in 2013 and started working on optimizing sort for larger buffers in MapReduce (MAPREDUCE-4755), but ended up rewriting the entire thing and then added it to Tez.

Cheers,
Gopal
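
P.S. To make the negative offset less mysterious, here is a minimal sketch of how plain 32-bit offset arithmetic can go negative once a buffer is larger than 1 GB. This is not the actual MapTask code: the class and the helpers naiveEnd and wrappedEnd are hypothetical names made up for illustration, and the exact expression that overflows inside MapOutputBuffer is an assumption on my part.

```java
public class SortBufferOverflowSketch {
    // Hypothetical helper, NOT Hadoop code: end offset computed in plain int arithmetic.
    static int naiveEnd(int pos, int len) {
        return pos + len; // silently overflows once the sum exceeds Integer.MAX_VALUE
    }

    // Hypothetical helper: widen to long first, then wrap around the buffer capacity.
    static int wrappedEnd(int pos, int len, int capacity) {
        return (int) (((long) pos + len) % capacity);
    }

    public static void main(String[] args) {
        int capacity = 1536 * 1024 * 1024; // a 1536 MB buffer still fits in an int
        int pos = 1_500_000_000;           // write position close to the end of the buffer
        int len = 700_000_000;             // bytes about to be written

        int bad = naiveEnd(pos, len);
        int good = wrappedEnd(pos, len, capacity);
        System.out.println("naive end offset:   " + bad);
        System.out.println("wrapped end offset: " + good);

        // A negative offset used as an array index fails the same way as the quoted trace.
        byte[] buf = new byte[16];
        try {
            buf[bad] = 1;
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("negative index rejected by the JVM");
        }
    }
}
```

With a buffer of 1024 MB or less, a position inside the buffer plus a length no larger than the buffer adds up to at most Integer.MAX_VALUE, so in this simplified model the sum cannot overflow. That is consistent with the 1024 MB workaround, though the real code paths may differ.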