My guess is that an array element access is more expensive than a field access, regardless of volatile. For an array element the offset is computed at runtime and may need to be checked against the range of valid indices, while the offset of a field is known in advance and can be a constant in the instruction stream.

These different base access costs probably prevent a direct comparison of the non-hoisted/hoisted ratios.

Also, you would probably have to look at the generated code.

For example, in the hoisted field case the loop for the sum could be replaced by a multiplication value * count.
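As a minimal sketch of that point: with int arithmetic, adding the same value count times gives the same result as a single multiplication, including on overflow. The class and method names below are hypothetical, and whether the JIT actually performs this transformation for the benchmark is an assumption that only the generated code can confirm.

```java
public class HoistedSumSketch {

    // Same shape as the hoisted field benchmark: a loop-invariant value is summed.
    static int sumByLoop(int value, int count) {
        int sum = 0;
        for (int i = 0; i < count; i++) {
            sum += value;
        }
        return sum;
    }

    // What an optimizer could legally reduce the loop to.
    // The guard covers count <= 0, where the loop body never runs.
    static int sumByMultiplication(int value, int count) {
        return count > 0 ? value * count : 0;
    }

    public static void main(String[] args) {
        int[] values = {0, 1, -7, 12345, Integer.MAX_VALUE, Integer.MIN_VALUE};
        int[] counts = {0, 1, 10, 100};
        for (int value : values) {
            for (int count : counts) {
                // Both forms wrap around identically in 32-bit arithmetic.
                if (sumByLoop(value, count) != sumByMultiplication(value, count)) {
                    throw new AssertionError("mismatch for " + value + " * " + count);
                }
            }
        }
        System.out.println("loop and multiplication agree");
    }
}
```

If the loop were reduced like this, the hoisted field benchmark would no longer measure a loop at all, so its time would barely depend on count.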




On 2023-08-16 16:06, Сергей Цыпанов wrote:
I meant the relation between in-loop and hoisted access.
In Java 19, with count = 100, hoisting the volatile array read out of the loop reduces the average time from 146 to 33 ns. With the same count, hoisting the "plain" field read out of the loop reduces the average time from 98 to 7 ns.
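To put numbers on that relation, here is a small sketch that derives both the in-loop/hoisted ratio and the absolute extra time per iteration from the Java 19 scores for count = 100 reported in this thread. The class and method names are hypothetical; only the four scores come from the measurements.

```java
public class HoistRatioSketch {

    // In-loop time divided by hoisted time.
    static double ratio(double inLoopNs, double hoistedNs) {
        return inLoopNs / hoistedNs;
    }

    // Extra time per iteration when the volatile read stays inside the loop.
    static double extraPerIteration(double inLoopNs, double hoistedNs, int count) {
        return (inLoopNs - hoistedNs) / count;
    }

    public static void main(String[] args) {
        int count = 100;
        // Java 19 scores in ns/op for count = 100, taken from the results in this thread.
        double arrayInLoop = 146.497, arrayHoisted = 33.262;
        double fieldInLoop = 98.648, fieldHoisted = 7.126;

        System.out.printf("array: ratio %.2f, extra %.3f ns per iteration%n",
                ratio(arrayInLoop, arrayHoisted),
                extraPerIteration(arrayInLoop, arrayHoisted, count));
        System.out.printf("field: ratio %.2f, extra %.3f ns per iteration%n",
                ratio(fieldInLoop, fieldHoisted),
                extraPerIteration(fieldInLoop, fieldHoisted, count));
    }
}
```

The ratios come out at roughly 4.4 for the array and 13.8 for the field, while the extra time per iteration is roughly 1.13 ns and 0.92 ns. Given the error margins of the Java 19 run, this is at most a hint that the ratio is dominated by the hoisted baseline and not by the volatile read itself.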


If I read the data correctly, for the count=100 case in JDK 20 it takes 109 ns/op for the array and 74 ns/op for the field.

To me this looks like a field access is _less_ expensive.

Am I missing something?

On 2023-08-16 13:37, Сергей Цыпанов wrote:

Hello,

I was measuring the cost of hoisting a volatile access out of a loop and found that the numbers differ for arrays and "plain" references.

Here's the benchmark for the array:

import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(time = 2, iterations = 5)
@Measurement(time = 2, iterations = 5)
@Fork(value = 4, jvmArgs = "-Xmx1g")
public class VolatileArrayInLoopBenchmark {

    // Reads the volatile array reference on every iteration.
    @Benchmark
    public int accessVolatileInLoop(Data data) {
        int sum = 0;
        for (int i = 0; i < data.count; i++) {
            sum += data.ints[i];
        }
        return sum;
    }

    // Reads the volatile array reference once, before the loop.
    @Benchmark
    public int hoistVolatileFromLoop(Data data) {
        int sum = 0;
        int[] ints = data.ints;
        for (int i = 0; i < data.count; i++) {
            sum += ints[i];
        }
        return sum;
    }

    @State(Scope.Benchmark)
    public static class Data {
        @Param({"1", "10", "100"})
        private int count;
        private volatile int[] ints;

        @Setup
        public void setUp() {
            int[] ints = new int[count];
            for (int i = 0; i < ints.length; i++) {
                ints[i] = ThreadLocalRandom.current().nextInt();
            }
            this.ints = ints;
        }
    }
}

and this one is for the "plain" reference (a volatile int field):

import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(time = 2, iterations = 5)
@Measurement(time = 2, iterations = 5)
@Fork(value = 4, jvmArgs = "-Xmx1g")
public class VolatileFieldInLoopBenchmark {

    // Reads the volatile field on every iteration.
    @Benchmark
    public int accessVolatileInLoop(Data data) {
        int sum = 0;
        for (int i = 0; i < data.count; i++) {
            sum += data.value;
        }
        return sum;
    }

    // Reads the volatile field once, before the loop.
    @Benchmark
    public int hoistVolatileFromLoop(Data data) {
        int sum = 0;
        int value = data.value;
        for (int i = 0; i < data.count; i++) {
            sum += value;
        }
        return sum;
    }

    @State(Scope.Benchmark)
    public static class Data {
        private final ThreadLocalRandom random = ThreadLocalRandom.current();

        private volatile int value = random.nextInt();

        @Param({"1", "10", "100"})
        private int count;
    }
}

From the measurement results it looks like volatile array access is cheaper than "plain" reference access:

Java 19

Benchmark                                           (count)  Mode  Cnt    Score    Error  Units
VolatileArrayInLoopBenchmark.accessVolatileInLoop         1  avgt   20    2.110 ±  0.404  ns/op
VolatileArrayInLoopBenchmark.accessVolatileInLoop        10  avgt   20   14.836 ±  2.825  ns/op
VolatileArrayInLoopBenchmark.accessVolatileInLoop       100  avgt   20  146.497 ± 25.786  ns/op
VolatileArrayInLoopBenchmark.hoistVolatileFromLoop        1  avgt   20    3.006 ±  0.686  ns/op
VolatileArrayInLoopBenchmark.hoistVolatileFromLoop       10  avgt   20    6.222 ±  1.215  ns/op
VolatileArrayInLoopBenchmark.hoistVolatileFromLoop      100  avgt   20   33.262 ±  6.579  ns/op

VolatileFieldInLoopBenchmark.accessVolatileInLoop         1  avgt   20    1.823 ±  0.382  ns/op
VolatileFieldInLoopBenchmark.accessVolatileInLoop        10  avgt   20   10.259 ±  2.874  ns/op
VolatileFieldInLoopBenchmark.accessVolatileInLoop       100  avgt   20   98.648 ± 18.500  ns/op
VolatileFieldInLoopBenchmark.hoistVolatileFromLoop        1  avgt   20    2.189 ±  0.412  ns/op
VolatileFieldInLoopBenchmark.hoistVolatileFromLoop       10  avgt   20    4.734 ±  0.891  ns/op
VolatileFieldInLoopBenchmark.hoistVolatileFromLoop      100  avgt   20    7.126 ±  1.309  ns/op

Java 20

Benchmark                                           (count)  Mode  Cnt    Score    Error  Units
VolatileArrayInLoopBenchmark.accessVolatileInLoop         1  avgt   20    1.714 ±  0.066  ns/op
VolatileArrayInLoopBenchmark.accessVolatileInLoop        10  avgt   20   10.703 ±  0.148  ns/op
VolatileArrayInLoopBenchmark.accessVolatileInLoop       100  avgt   20  109.001 ±  1.866  ns/op
VolatileArrayInLoopBenchmark.hoistVolatileFromLoop        1  avgt   20    2.408 ±  0.224  ns/op
VolatileArrayInLoopBenchmark.hoistVolatileFromLoop       10  avgt   20    4.678 ±  0.060  ns/op
VolatileArrayInLoopBenchmark.hoistVolatileFromLoop      100  avgt   20   24.711 ±  1.091  ns/op

VolatileFieldInLoopBenchmark.accessVolatileInLoop         1  avgt   20    1.366 ±  0.105  ns/op
VolatileFieldInLoopBenchmark.accessVolatileInLoop        10  avgt   20    7.388 ±  0.119  ns/op
VolatileFieldInLoopBenchmark.accessVolatileInLoop       100  avgt   20   74.630 ±  1.163  ns/op
VolatileFieldInLoopBenchmark.hoistVolatileFromLoop        1  avgt   20    1.653 ±  0.035  ns/op
VolatileFieldInLoopBenchmark.hoistVolatileFromLoop       10  avgt   20    3.138 ±  0.040  ns/op
VolatileFieldInLoopBenchmark.hoistVolatileFromLoop      100  avgt   20    4.945 ±  0.177  ns/op

So my question is: why is volatile reference access relatively more expensive than volatile array access?
