----- Original Message -----
> From: "cay horstmann" <[email protected]>
> To: "valhalla-dev" <[email protected]>
> Sent: Saturday, November 1, 2025 5:58:12 PM
> Subject: Re: Raytracing Experience Report

> You are on the right track replacing the array list with an array.
> 
> But if you want the VM to flatten the array, you need to use the non-public
> API for now. And you need to tell the VM that the values are never null, and
> if they are larger than 64 bits, that you don't care about tearing.
> 
> https://horstmann.com/presentations/2025/jfn-valhalla/#(15)
> https://horstmann.com/presentations/2025/jfn-valhalla/#(17)
> 
> Cheers,
> 
> Cay
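
To make that advice concrete, here is a minimal sketch of asking for such an
array. The class name jdk.internal.value.ValueClass and the method
newNullRestrictedNonAtomicArray are my assumption about the internal API in
the current EA build; they are unsupported and may change. On a JDK that does
not have them, or without
--add-exports java.base/jdk.internal.value=ALL-UNNAMED, the helper falls back
to an ordinary array. Sphere and FlatArrays are hypothetical names, and Sphere
is a plain record here so that the code compiles anywhere.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Method;
import java.util.Arrays;

// Plain record so the sketch compiles on any recent JDK. On a Valhalla EA
// build it would be declared as a value record and compiled with --enable-preview.
record Sphere(double x, double y, double z, double radius) {}

final class FlatArrays {
    private FlatArrays() {}

    // ASSUMPTION: names of the internal, unsupported API. Not a public contract.
    private static final String VALUE_CLASS = "jdk.internal.value.ValueClass";
    private static final String FACTORY = "newNullRestrictedNonAtomicArray";

    /**
     * Asks the VM for a null-restricted, non-atomic array (never null, tearing
     * allowed). If the internal API is missing, inaccessible, or rejects the
     * type, returns an ordinary array pre-filled with {@code initial}.
     */
    @SuppressWarnings("unchecked")
    static <T> T[] newArray(Class<T> type, int length, T initial) {
        try {
            Class<?> valueClass = Class.forName(VALUE_CLASS);
            Method factory =
                valueClass.getMethod(FACTORY, Class.class, int.class, Object.class);
            return (T[]) factory.invoke(null, type, length, initial);
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Fallback: a regular reference array with the same initial contents.
            T[] fallback = (T[]) Array.newInstance(type, length);
            Arrays.fill(fallback, initial);
            return fallback;
        }
    }
}

final class FlatArrayDemo {
    public static void main(String[] args) {
        Sphere[] world = FlatArrays.newArray(Sphere.class, 4, new Sphere(0, 0, 0, 1));
        world[1] = new Sphere(0, -100.5, -1, 100);
        System.out.println(world.length + " spheres, first radius " + world[0].radius());
    }
}
```

The reflective lookup is only there so the sketch runs on any JDK; code that
targets the EA build alone could call the factory directly.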

Or you can use an implementation of List that uses specialization:

https://github.com/forax/weather-alert/blob/master/src/main/java/util/FlatListFactory.java#L433

Rémi

> 
> On 30/10/2025 19:08, Ethan McCue wrote:
>> I did try that - by replacing the ArrayList<Material> with a Sphere[] - there
>> was a modest speedup. But the C++ code itself uses an abstract hittable class
>> and has a std::vector<shared_ptr<hittable>>. So if I were to make that change
>> in the Java version to get better performance I would feel the need to do the
>> same in the C++ or else it would not be a fair comparison.
>> 
>> The only other thing I could think of - replacing Optional<HitRecord> with a
>> nullable HitRecord - didn't move the needle. VisualVM doesn't support the EA
>> build, so I'm not sure how I would dig further. It is possible
>> System.out.println might be the bottleneck now, but I somewhat doubt it.
>> 
>> 
>> 
>> On Thu, Oct 30, 2025 at 1:56 PM Piotr Tarsa <[email protected]
>> <mailto:[email protected]>> wrote:
>> 
>>     Hi Ethan,
>> 
>>     IIRC Valhalla still doesn't have reified or specialized generics in any
>>     way, so anything generic, like List<Whatever> or Optional<Whatever>, is
>>     erased to non-generic form. The layout of generic classes is not
>>     specialized to the generic parameter; instead, the instances of
>>     'Whatever' are unconditionally boxed. I think that the first step to get
>>     performance closer to C++ with the current Valhalla state would be to
>>     avoid all generics in hot execution paths and then redo the experiments.
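
A small sketch of what erasure means in practice, in plain Java so it runs on
any recent JDK. Vec3 and ErasureDemo are hypothetical names, not taken from
the raytracer repo. The first method shows that two differently parameterized
lists share one runtime class; the other two contrast a generic container
with a concrete array type, which is the kind of change suggested for hot
paths.

```java
import java.util.ArrayList;
import java.util.List;

record Vec3(double x, double y, double z) {}

final class ErasureDemo {
    private ErasureDemo() {}

    // After erasure both lists have the same runtime class, so the list
    // implementation cannot choose a layout based on its element type.
    static boolean sameRuntimeClass() {
        List<Vec3> vectors = new ArrayList<>();
        List<String> names = new ArrayList<>();
        return vectors.getClass() == names.getClass();
    }

    // Generic container: internally backed by Object[], elements are references.
    static double sumX(List<Vec3> points) {
        double sum = 0;
        for (Vec3 p : points) {
            sum += p.x();
        }
        return sum;
    }

    // Non-generic alternative: the element type is part of the array type,
    // which is what gives the VM a chance to flatten value-class elements.
    static double sumX(Vec3[] points) {
        double sum = 0;
        for (Vec3 p : points) {
            sum += p.x();
        }
        return sum;
    }

    public static void main(String[] args) {
        Vec3[] array = { new Vec3(1, 0, 0), new Vec3(2, 0, 0) };
        System.out.println(sameRuntimeClass());
        System.out.println(sumX(List.of(array)) == sumX(array));
    }
}
```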
>> 
>>     Regards,
>>     Piotr
>>     
>>     ________________________________________
>>     *From:* valhalla-dev <[email protected]
>>     <mailto:[email protected]>> on behalf of Ethan McCue
>>     <[email protected] <mailto:[email protected]>>
>>     *Sent:* Thursday, 30 October 2025 18:21
>>     *To:* Sergey Kuksenko <[email protected]
>>     <mailto:[email protected]>>
>>     *Cc:* [email protected] <mailto:[email protected]>
>>     <[email protected] <mailto:[email protected]>>
>>     *Subject:* Re: Raytracing Experience Report
>>     Continuing from this, I ran it against the reference C++ implementation
>>     and got these numbers.
>> 
>>     # Reference C++ implementation (-O3)
>> 
>>     ```
>>     real 6m35.702s
>>     user 6m33.780s
>>     sys 0m1.454s
>>     ```
>> 
>>     # Java With Value Classes
>> 
>>     ```
>>     real 11m50.122s
>>     user 11m36.536s
>>     sys 0m13.281s
>>     ```
>> 
>>     # Java Without Value Classes
>> 
>>     ```
>>     real 17m1.038s
>>     user 16m40.993s
>>     sys 0m29.400s
>>     ```
>> 
>>     I am wondering if using an AOT cache could help catch up to the C++, but
>>     I get a class file version error running with -XX:AOTCache=value.aot
>>
>>     Error: LinkageError occurred while loading main class Main
>>              java.lang.UnsupportedClassVersionError: Main has been compiled by a more recent
>>              version of the Java Runtime (class file version 70.0), this version of the Java
>>              Runtime only recognizes class file versions up to 69.0
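
For reading that error: the major version is stored in the class file header,
and the Java feature release is the major version minus 44, so 70 corresponds
to Java 26 and 69 to Java 25. My guess, and it is only a guess, is that the
command with the AOT flag was picked up by an older java binary than the one
that compiled Main. A small sketch for checking what a given .class file was
compiled for (ClassFileVersion is a hypothetical helper name):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

final class ClassFileVersion {
    private ClassFileVersion() {}

    /** Returns major_version from the first 8 bytes of a class file. */
    static int majorVersion(byte[] classBytes) {
        ByteBuffer header = ByteBuffer.wrap(classBytes); // big-endian by default
        if (classBytes.length < 8 || header.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        header.getShort();                 // minor_version, skipped
        return header.getShort() & 0xFFFF; // major_version, read as unsigned
    }

    /** Maps a class file major version to the Java feature release. */
    static int featureRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = Files.readAllBytes(Path.of(args[0]));
        int major = majorVersion(bytes);
        System.out.println(args[0] + ": class file version " + major
                + " (Java " + featureRelease(major) + ")");
    }
}
```

It could be run as java ClassFileVersion.java build/classes/Main.class, and
comparing the result with the output of java -version for the launcher that
is actually on the PATH should show whether the two disagree.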
>> 
>>     On Wed, Oct 29, 2025 at 3:44 PM Sergey Kuksenko <[email protected]
>>     <mailto:[email protected]>> wrote:
>> 
>>         Hi Ethan,
>> 
>>         Thank you for the information. Your example and the code are pretty
>>         straightforward, and I was able to repeat and diagnose the issue.
>> 
>>         The fact is, the performance issue is not directly related to value
>>         classes. The problem is that the HittableList::hit method (invoked
>>         at Camera::rayColor) was inlined by the JIT in the non-value version
>>         and wasn't inlined in the value classes version.
>>         When you inline that invocation manually, you should get the same
>>         performance for both versions.
>>         HittableList::hit was not inlined in the value classes version
>>         because value classes resulted in a different code size, which
>>         changed the outcome of the inlining heuristics.
>>         It's a mainline issue; you'll encounter it quite rarely. The current
>>         inlining heuristics work well in 99% of cases, and you would have to
>>         be very lucky (or unlucky) to hit it in real life.
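
To illustrate what "inline that invocation manually" could look like, here is
a simplified sketch. It is not the code from the repo: the types are plain
records, a hit is reported as a distance (NaN for a miss) instead of a
HitRecord, and nearestViaCall / nearestInlined are hypothetical names. The
second Camera method has the loop from HittableList.hit copied into the
caller, so it no longer depends on the JIT deciding to inline that call.

```java
record Vec3(double x, double y, double z) {
    Vec3 minus(Vec3 o) { return new Vec3(x - o.x, y - o.y, z - o.z); }
    double dot(Vec3 o) { return x * o.x + y * o.y + z * o.z; }
}

record Ray(Vec3 origin, Vec3 direction) {}

record Sphere(Vec3 center, double radius) {
    // Nearest intersection distance in (tMin, tMax), or NaN if the ray misses.
    double hit(Ray r, double tMin, double tMax) {
        Vec3 oc = center.minus(r.origin());
        double a = r.direction().dot(r.direction());
        double h = r.direction().dot(oc);
        double c = oc.dot(oc) - radius * radius;
        double discriminant = h * h - a * c;
        if (discriminant < 0) return Double.NaN;
        double sqrtD = Math.sqrt(discriminant);
        double t = (h - sqrtD) / a;          // nearer root first
        if (t <= tMin || t >= tMax) {
            t = (h + sqrtD) / a;             // then the farther root
            if (t <= tMin || t >= tMax) return Double.NaN;
        }
        return t;
    }
}

final class HittableList {
    final Sphere[] objects;
    HittableList(Sphere... objects) { this.objects = objects; }

    // Distance to the closest object hit by the ray, or NaN if nothing is hit.
    double hit(Ray r, double tMin, double tMax) {
        double closest = tMax;
        boolean hitAnything = false;
        for (Sphere s : objects) {
            double t = s.hit(r, tMin, closest);
            if (!Double.isNaN(t)) { closest = t; hitAnything = true; }
        }
        return hitAnything ? closest : Double.NaN;
    }
}

final class Camera {
    private Camera() {}

    // Original shape: relies on the JIT to inline HittableList.hit.
    static double nearestViaCall(Ray r, HittableList world) {
        return world.hit(r, 0.001, Double.POSITIVE_INFINITY);
    }

    // Manually inlined: same loop, written directly at the call site.
    static double nearestInlined(Ray r, HittableList world) {
        double closest = Double.POSITIVE_INFINITY;
        boolean hitAnything = false;
        for (Sphere s : world.objects) {
            double t = s.hit(r, 0.001, closest);
            if (!Double.isNaN(t)) { closest = t; hitAnything = true; }
        }
        return hitAnything ? closest : Double.NaN;
    }
}

final class InlineDemo {
    public static void main(String[] args) {
        HittableList world = new HittableList(
            new Sphere(new Vec3(0, 0, -5), 1), new Sphere(new Vec3(0, 0, -3), 1));
        Ray ray = new Ray(new Vec3(0, 0, 0), new Vec3(0, 0, -1));
        System.out.println(Camera.nearestViaCall(ray, world)
            == Camera.nearestInlined(ray, world));
    }
}
```

To see what the JIT actually decided without editing the source, HotSpot can
print its inlining decisions with -XX:+UnlockDiagnosticVMOptions
-XX:+PrintInlining.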
>> 
>>         Best regards,
>>         Sergey Kuksenko
>> 
>> 
>> 
>>         ________________________________________
>>         From: valhalla-dev <[email protected]
>>         <mailto:[email protected]>> on behalf of Ethan McCue
>>         <[email protected] <mailto:[email protected]>>
>>         Sent: Monday, October 27, 2025 5:08 PM
>>         To: [email protected] <mailto:[email protected]>
>>         Subject: Raytracing Experience Report
>> 
>>         Hi all,
>> 
>>         I have been following along in the "Ray Tracing in One Weekend" book
>>         and trying to make as many classes as possible value classes. (Vec3,
>>         Ray, etc.)
>> 
>>         https://github.com/bowbahdoe/raytracer
>>         <https://github.com/bowbahdoe/raytracer>
>> 
>>         https://raytracing.github.io/books/RayTracingInOneWeekend.html
>>         <https://raytracing.github.io/books/RayTracingInOneWeekend.html>
>> 
>>         (without value classes)
>> 
>>         time java --enable-preview --class-path build/classes Main > image.ppm
>> 
>>         real 4m33.190s
>>         user 4m28.984s
>>         sys 0m5.511s
>> 
>>         (with value classes)
>> 
>>         time java --enable-preview --class-path build/classes Main > image.ppm
>> 
>>         real 3m54.623s
>>         user 3m52.205s
>>         sys 0m2.064s
>> 
>>         So by the end the version using value classes beats the version
>>         without them by ~14% using unscientific measurements.
>>
>>         But that is at the end, running the ray tracer on a relatively large
>>         scene with all the features turned on. Before that point there were
>>         some checkpoints where using value classes performed noticeably worse
>>         than the equivalent code sans the value modifier.
>> 
>>         https://github.com/bowbahdoe/raytracer/tree/no-value-faster
>>         <https://github.com/bowbahdoe/raytracer/tree/no-value-faster>
>> 
>>         real 1m22.172s
>>         user 1m9.871s
>>         sys 0m12.951s
>> 
>>         https://github.com/bowbahdoe/raytracer/tree/with-value-slower
>>         <https://github.com/bowbahdoe/raytracer/tree/with-value-slower>
>> 
>>         real 3m34.440s
>>         user 3m19.656s
>>         sys 0m14.870s
>> 
>>         So for some reason just adding value to the records/classes makes the
>>         program run over 2x slower.
>> 
>>         https://github.com/bowbahdoe/raytracer/compare/no-value-faster...with-value-slower
>>         <https://github.com/bowbahdoe/raytracer/compare/no-value-faster...with-value-slower>
>> 
>>         Is there some intuition that explains this? I am on a stock M1 Arm Mac.
>> 
> 
> --
> 
> Cay S. Horstmann | https://horstmann.com
