Optimization (?) of HashSet(Collection)

Сергей Цыпанов Mon, 04 Oct 2021 04:57:45 -0700

Hello,

in the code of HashSet(Collection) we have an optimization opportunity:


public HashSet(Collection<? extends E> c) {
  map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
  addAll(c);
}

instead of using addAll() inherited from j.u.Collection we can use 
c.forEach(this::add):

public HashSet(Collection<? extends E> c) {
  map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
  c.forEach(this::add);
}

This allows to rid Iterator and use counting loops for e.g. ArrayList instead 
of hasNext()/next().

To me the most common use cases of HashSet(Collection) are those:
- ArrayList/Arrays.asList() is passed to convert List->Set or/and excluded 
duplicates
- HashSet is passed to have a copy of original Set

We are also likely to benefit from this change in case the argument collection 
is synchronized as it's going to be locked only once.

I've used the benchmark [1] to analyze the impact and it demonstrates 
measurable improvement [2].

What puzzles we a bit is the result for j.u.ArrayList. At warm-up time it 
demonstrates better results than at measurement time:

 length = 10

# Run progress: 4,44% complete, ETA 00:21:41
# Fork: 3 of 5
# Warmup Iteration   1: 134,699 ns/op
# Warmup Iteration   2: 115,391 ns/op
# Warmup Iteration   3: 130,110 ns/op
# Warmup Iteration   4: 114,465 ns/op  <----
# Warmup Iteration   5: 114,849 ns/op
# Warmup Iteration   6: 115,073 ns/op
# Warmup Iteration   7: 113,988 ns/op
# Warmup Iteration   8: 115,301 ns/op
# Warmup Iteration   9: 115,452 ns/op
# Warmup Iteration  10: 124,493 ns/op  <----
Iteration   1: 123,776 ns/op
Iteration   2: 123,719 ns/op
Iteration   3: 123,725 ns/op
Iteration   4: 126,892 ns/op
Iteration   5: 125,493 ns/op
Iteration   6: 124,661 ns/op
Iteration   7: 123,783 ns/op
Iteration   8: 123,975 ns/op
Iteration   9: 124,047 ns/op
Iteration  10: 123,899 ns/op

length = 100

# Run progress: 11,11% complete, ETA 00:20:10
# Fork: 1 of 5
# Warmup Iteration   1: 1236,695 ns/op
# Warmup Iteration   2: 1069,909 ns/op
# Warmup Iteration   3: 1243,852 ns/op
# Warmup Iteration   4: 1059,038 ns/op  <----
# Warmup Iteration   5: 1057,670 ns/op
# Warmup Iteration   6: 1057,117 ns/op
# Warmup Iteration   7: 1056,932 ns/op
# Warmup Iteration   8: 1054,909 ns/op
# Warmup Iteration   9: 1058,106 ns/op
# Warmup Iteration  10: 1145,699 ns/op  <----
Iteration   1: 1139,155 ns/op
Iteration   2: 1141,662 ns/op
Iteration   3: 1139,061 ns/op
Iteration   4: 1139,306 ns/op
Iteration   5: 1138,475 ns/op
Iteration   6: 1140,491 ns/op
Iteration   7: 1144,017 ns/op
Iteration   8: 1139,411 ns/op
Iteration   9: 1142,729 ns/op
Iteration  10: 1144,042 ns/op


Here I use 1 s warm-up iteration on 2s measurement iteration. It looks like at 
iteration 4 the code is top-optimized,
and later at the end of warm-up a kind of deoptimization happens. There's still 
visible improvement however.

The benchmark is run with 5 forks, and this deoptimization is visible for 
length = {10, 100}.

So my two question are:
- What is the reason of the deoptimization and can we do something about it?
- Whether suggested changes is worth implementations?

Regards,
Sergey Tsypanov

1.

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
public class HashSetBenchmark {

  @Benchmark
  public HashSet<Integer> fromArrayList(Data data) {
    return new HashSet<>(data.arrayList);
  }

  @Benchmark
  public HashSet<Integer> fromHashSet(Data data) {
    return new HashSet<>(data.hashSet);
  }

  @Benchmark
  public HashSet<Integer> fromArraysAsList(Data data) {
    return new HashSet<>(data.arraysAsList);
  }

  @State(Scope.Thread)
  public static class Data {
    private ArrayList<Integer> arrayList;
    private HashSet<Integer> hashSet;
    private List<Integer> arraysAsList;

    @Param({"10", "100", "1000"})
    private int length;

    @Setup
    public void setup() throws IOException {
      var array = new Integer[length];
      for (int i = 0; i < length; i++) {
        array[i] = i;
      }
      arraysAsList = Arrays.asList(array);
      arrayList = new ArrayList<>(arraysAsList);
      hashSet = new HashSet<>(arraysAsList);
    }
  }
}

2.

baseline

Benchmark                          (length)  Mode  Cnt      Score     Error  
Units
HashSetBenchmark.fromArrayList           10  avgt   50    127,838 ±   1,822  
ns/op
HashSetBenchmark.fromArrayList          100  avgt   50   1209,543 ±   8,435  
ns/op
HashSetBenchmark.fromArrayList         1000  avgt   50  11915,334 ± 155,908  
ns/op
HashSetBenchmark.fromArraysAsList        10  avgt   50    130,049 ±   1,661  
ns/op
HashSetBenchmark.fromArraysAsList       100  avgt   50   1138,407 ±  19,998  
ns/op
HashSetBenchmark.fromArraysAsList      1000  avgt   50  11345,099 ±  83,086  
ns/op
HashSetBenchmark.fromHashSet             10  avgt   50    184,559 ±   1,387  
ns/op
HashSetBenchmark.fromHashSet            100  avgt   50   2001,577 ±  42,321  
ns/op
HashSetBenchmark.fromHashSet           1000  avgt   50  17535,017 ± 199,892  
ns/op

patched

Benchmark                          (length)  Mode  Cnt      Score     Error  
Units
HashSetBenchmark.fromArrayList           10  avgt   50    125,075 ±   0,907  
ns/op
HashSetBenchmark.fromArrayList          100  avgt   50   1127,378 ±  19,368  
ns/op
HashSetBenchmark.fromArrayList         1000  avgt   50  11455,172 ±  61,543  
ns/op
HashSetBenchmark.fromArraysAsList        10  avgt   50    117,634 ±   3,641  
ns/op
HashSetBenchmark.fromArraysAsList       100  avgt   50   1034,538 ±  11,958  
ns/op
HashSetBenchmark.fromArraysAsList      1000  avgt   50  10786,914 ± 137,330  
ns/op
HashSetBenchmark.fromHashSet             10  avgt   50    173,727 ±   2,177  
ns/op
HashSetBenchmark.fromHashSet            100  avgt   50   1748,858 ±  12,535  
ns/op
HashSetBenchmark.fromHashSet           1000  avgt   50  16474,208 ±  97,578  
ns/op

Optimization (?) of HashSet(Collection)

Reply via email to