Hi Ethan,

IIRC Valhalla still has neither reified nor specialized generics, so anything 
generic, like List<Whatever> or Optional<Whatever>, is erased to its 
non-generic form. The layout of a generic class is not specialized to its 
type argument; instead, the instances of 'Whatever' are unconditionally 
boxed. I think the first step toward getting performance closer to C++ with 
the current state of Valhalla would be to avoid all generics in hot execution 
paths and then redo the experiments.
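
To make that suggestion concrete, here is a minimal sketch. It is not taken 
from the raytracer repository: Vec3 is written as a plain record so that it 
compiles on a mainline JDK, and HotPathSketch, sumGeneric and sumArray are 
hypothetical names. Whether a given Valhalla build actually flattens the 
array variant is an assumption that would need to be measured.

```java
import java.util.List;

// Stand-in for a value class. On a Valhalla build with --enable-preview this
// might be declared as a value record instead (assumption, not verified here).
record Vec3(double x, double y, double z) {
    Vec3 plus(Vec3 o) {
        return new Vec3(x + o.x, y + o.y, z + o.z);
    }
}

final class HotPathSketch {
    // Generic container: List<Vec3> is erased, so the list holds Object
    // references and every element is reached through a pointer.
    static Vec3 sumGeneric(List<Vec3> points) {
        Vec3 acc = new Vec3(0, 0, 0);
        for (Vec3 p : points) {
            acc = acc.plus(p);
        }
        return acc;
    }

    // Non-generic alternative: the element type stays Vec3 all the way down,
    // which leaves the JVM free to pick a denser layout if it supports one.
    static Vec3 sumArray(Vec3[] points) {
        Vec3 acc = new Vec3(0, 0, 0);
        for (Vec3 p : points) {
            acc = acc.plus(p);
        }
        return acc;
    }

    public static void main(String[] args) {
        Vec3[] points = { new Vec3(1, 2, 3), new Vec3(4, 5, 6) };
        System.out.println(sumArray(points));
        System.out.println(sumGeneric(List.of(points)));
    }
}
```

Both methods compute the same sum; only the container type differs, which 
is the part worth swapping out in a hot loop before re-running the timings.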

Regards,
Piotr
________________________________
From: valhalla-dev <[email protected]> on behalf of Ethan 
McCue <[email protected]>
Sent: Thursday, October 30, 2025 18:21
To: Sergey Kuksenko <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: Raytracing Experience Report

Continuing from this, I ran it against the reference C++ implementation and got 
these numbers.

# Reference C++ implementation (-O3)

```
real 6m35.702s
user 6m33.780s
sys 0m1.454s
```

# Java With Value Classes

```
real 11m50.122s
user 11m36.536s
sys 0m13.281s
```

# Java Without Value Classes

```
real 17m1.038s
user 16m40.993s
sys 0m29.400s
```
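
For reference, the three wall-clock figures above work out to roughly 1.8x 
(with value classes) and 2.6x (without) the C++ time. A small sketch of that 
arithmetic follows; TimingRatios and seconds are hypothetical helper names, 
and the inputs are the `real` values copied from the runs above.

```java
final class TimingRatios {
    // Parses the "XmY.ZZZs" format printed by the shell's `time`,
    // e.g. "6m35.702s", into seconds.
    static double seconds(String t) {
        int m = t.indexOf('m');
        double minutes = Double.parseDouble(t.substring(0, m));
        double secs = Double.parseDouble(t.substring(m + 1, t.length() - 1));
        return minutes * 60 + secs;
    }

    public static void main(String[] args) {
        double cpp = seconds("6m35.702s");
        double withValue = seconds("11m50.122s");
        double withoutValue = seconds("17m1.038s");
        System.out.printf("with value classes / C++:    %.2fx%n", withValue / cpp);
        System.out.printf("without value classes / C++: %.2fx%n", withoutValue / cpp);
    }
}
```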

I am wondering if using an AOT cache could help catch up to the C++ version, 
but I get a class file version error when running with -XX:AOTCache=value.aot:

Error: LinkageError occurred while loading main class Main
        java.lang.UnsupportedClassVersionError: Main has been compiled by a more recent version of the Java Runtime (class file version 70.0), this version of the Java Runtime only recognizes class file versions up to 69.0
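
Class file major version 69 corresponds to JDK 25 and 70 to JDK 26, so one 
possible reading of this error is that the `java` launcher used for that run 
was not the JDK that compiled Main. The sketch below is one way to check both 
sides. ClassFileVersion is a hypothetical helper, and the default path 
build/classes/Main.class is an assumption based on the --class-path used in 
this thread.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

final class ClassFileVersion {
    // A class file starts with: u4 magic (0xCAFEBABE), u2 minor_version,
    // u2 major_version, all big-endian (ByteBuffer's default byte order).
    static int majorVersion(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                   // skip minor_version
        return buf.getShort() & 0xFFFF;   // major_version as unsigned
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass a different path as the first argument.
        Path classFile = Path.of(args.length > 0 ? args[0] : "build/classes/Main.class");
        System.out.println("class file major version: "
                + majorVersion(Files.readAllBytes(classFile)));
        // What the JVM running this program accepts.
        System.out.println("running JVM: " + Runtime.version()
                + " (java.class.version=" + System.getProperty("java.class.version") + ")");
    }
}
```

Running it with the same `java` binary that produced the error would show 
whether the two version numbers disagree.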

On Wed, Oct 29, 2025 at 3:44 PM Sergey Kuksenko <[email protected]> wrote:
Hi Ethan,

Thank you for the information. Your example and the code are pretty 
straightforward, and I was able to repeat and diagnose the issue.

The fact is, the performance issue is not directly related to value classes. 
The problem is that the HittableList::hit method (invoked from 
Camera::rayColor) was inlined by the JIT in the non-value version and wasn't 
inlined in the value classes version.
When you inline that invocation manually, you should get the same performance 
for both versions.
HittableList::hit was not inlined in the value classes version because value 
classes resulted in a different code size, which changed the outcome of the 
inlining heuristics. It's a mainline issue; you'll encounter it quite rarely. 
The current inlining heuristics work well in 99% of cases, and you would have 
to be very lucky (or unlucky) to hit this in real life.
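
As an illustration of what inlining the invocation manually can look like, 
here is a simplified sketch. Only the names HittableList::hit and 
Camera::rayColor come from the message above; Ray, Plane, shade, 
rayColorInlined and all signatures are hypothetical and much simpler than 
the real raytracer code.

```java
// Hypothetical one-dimensional stand-ins for the raytracer's types.
record Ray(double origin, double direction) {}

interface Hittable {
    // Distance along the ray to the hit, or POSITIVE_INFINITY on a miss.
    double hit(Ray ray);
}

record Plane(double position) implements Hittable {
    public double hit(Ray ray) {
        double t = (position - ray.origin()) / ray.direction();
        return t > 0 ? t : Double.POSITIVE_INFINITY;
    }
}

final class HittableList implements Hittable {
    final Hittable[] objects;

    HittableList(Hittable[] objects) {
        this.objects = objects;
    }

    public double hit(Ray ray) {
        double closest = Double.POSITIVE_INFINITY;
        for (Hittable h : objects) {
            closest = Math.min(closest, h.hit(ray));
        }
        return closest;
    }
}

final class Camera {
    // Original shape: the call to HittableList::hit is left for the JIT to inline.
    static double rayColor(Ray ray, HittableList world) {
        return shade(world.hit(ray));
    }

    // Manually inlined: the loop from HittableList::hit is copied into the
    // caller, so the result no longer depends on the JIT's inlining decision.
    static double rayColorInlined(Ray ray, HittableList world) {
        double closest = Double.POSITIVE_INFINITY;
        for (Hittable h : world.objects) {
            closest = Math.min(closest, h.hit(ray));
        }
        return shade(closest);
    }

    // Toy shading: 0 on a miss, brighter for closer hits.
    static double shade(double t) {
        return t == Double.POSITIVE_INFINITY ? 0.0 : 1.0 / (1.0 + t);
    }
}
```

To see what the JIT decided for the real code, HotSpot can print its 
inlining decisions with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining.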

Best regards,
Sergey Kuksenko



________________________________________
From: valhalla-dev <[email protected]> on behalf of Ethan McCue 
<[email protected]>
Sent: Monday, October 27, 2025 5:08 PM
To: [email protected]
Subject: Raytracing Experience Report

Hi all,

I have been following along in the "Ray Tracing in One Weekend" book and 
trying to make as many classes as possible value classes (Vec3, Ray, etc.).

https://github.com/bowbahdoe/raytracer

https://raytracing.github.io/books/RayTracingInOneWeekend.html

(without value classes)

time java --enable-preview --class-path build/classes Main > image.ppm

real 4m33.190s
user 4m28.984s
sys 0m5.511s

(with value classes)

time java --enable-preview --class-path build/classes Main > image.ppm

real 3m54.623s
user 3m52.205s
sys 0m2.064s

So by the end the version using value classes beats the version without them by 
~14% using unscientific measurements.

But that is at the end, running the ray tracer on a relatively large scene with 
all the features turned on. Before that point, there were some checkpoints where 
using value classes performed noticeably worse than the equivalent code sans 
the value modifier.

https://github.com/bowbahdoe/raytracer/tree/no-value-faster

real 1m22.172s
user 1m9.871s
sys 0m12.951s

https://github.com/bowbahdoe/raytracer/tree/with-value-slower

real 3m34.440s
user 3m19.656s
sys 0m14.870s

So for some reason, just adding value to the records/classes makes the program 
run over 2x as slow.

https://github.com/bowbahdoe/raytracer/compare/no-value-faster...with-value-slower

Is there some intuition that explains this? I am on a stock M1 Arm Mac.
