Maybe we should discuss whether the number of elements in an array can exceed 67108864 in our use cases. For example, FairScheduler uses Collections.sort(), but the number of jobs does not exceed 67108864 in many use cases, so we can keep using it there. It would also be reasonable to choose a safe algorithm for the sake of stability.
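To make the two options concrete, here is a rough sketch of a size check combined with a check for the legacy merge sort switch. The class name SortGuard and its methods are made up for illustration and are not existing Hadoop code; the threshold is simply the number quoted in this thread, and the property name is the JDK system property java.util.Arrays.useLegacyMergeSort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hadoop or the JDK.
final class SortGuard {
    // Element count quoted in this thread as the point past which the bug can surface.
    static final int REPORTED_THRESHOLD = 67108864;

    // JDK system property that makes Arrays.sort(Object[]) use the legacy merge sort.
    static final String LEGACY_PROPERTY = "java.util.Arrays.useLegacyMergeSort";

    // True if the JVM was started with -Djava.util.Arrays.useLegacyMergeSort=true.
    static boolean legacyMergeSortRequested() {
        return Boolean.getBoolean(LEGACY_PROPERTY);
    }

    // True if the collection is larger than the size reported as problematic.
    static boolean exceedsReportedThreshold(int size) {
        return size > REPORTED_THRESHOLD;
    }

    // Sorts as usual, but warns when a large list is sorted without the workaround.
    static <T extends Comparable<? super T>> void sortWithCheck(List<T> list) {
        if (exceedsReportedThreshold(list.size()) && !legacyMergeSortRequested()) {
            System.err.println("warning: sorting " + list.size()
                + " elements without " + LEGACY_PROPERTY + "=true");
        }
        Collections.sort(list);
    }

    public static void main(String[] args) {
        List<Integer> jobs = new ArrayList<>(Arrays.asList(3, 1, 2));
        sortWithCheck(jobs);
        System.out.println(jobs);
        System.out.println("legacy merge sort requested: " + legacyMergeSortRequested());
    }
}
```

The property is most reliably set on the JVM command line, e.g. java -Djava.util.Arrays.useLegacyMergeSort=true SortGuard, so that it is already in place before the first sort runs. A check like this only reports the situation; it does not change which algorithm the JDK picks.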
Thanks,
- Tsuyoshi

On Thu, Feb 26, 2015 at 5:04 PM, Tsuyoshi Ozawa <oz...@apache.org> wrote:
> Hi hadoop developers,
>
> In the last 2 weeks, a JDK bug in TimSort, related to Collections#sort,
> was reported. How can we deal with this problem?
>
> http://envisage-project.eu/timsort-specification-and-verification/
> https://bugs.openjdk.java.net/browse/JDK-8072909
>
> The bug causes an ArrayIndexOutOfBoundsException if the number of elements
> is larger than 67108864.
>
> We use the sort method in at least 77 places.
> find . -name "*.java" | xargs grep "Collections.sort" | wc -l
> 77
>
> One reasonable workaround is to set the system property
> java.util.Arrays.useLegacyMergeSort to true by default.
>
> Thanks,
> - Tsuyoshi