On Thu, 10 Nov 2022 22:56:22 GMT, Jim Laskey <jlas...@openjdk.org> wrote:

>> src/java.base/share/classes/java/lang/template/Carriers.java line 415:
>> 
>>> 413:          */
>>> 414:         @Stable
>>> 415:         private final Object[] objects;
>> 
>> Looking at all this, I'm skeptical that carriers are worth the complexity 
>> for the performance gain they give. AFAICS the fast paths for `FMT` and 
>> `STR` avoid boxing values into a StringTemplate instance altogether. So 
>> carriers only seem to help cases where processing goes through the default 
>> `process` method on a user-defined processor?
>> 
>> AFAIK `@Stable` will not do anything here. It only helps constant folding 
>> from a constant root object and in general StringTemplate instances are not 
>> constant. (and, AFAIK there are also JIT caveats with storing everything in 
>> arrays like this).
>> 
>> I suggest maybe keeping things simple in this PR, and having just 
>> `SimpleStringTemplate`. Public support for the linkage-based fast path can 
>> be used to avoid the boxing for user defined processors in the future as 
>> well, and that seems like a much better way to get performance to me.
>> 
>> If carriers are added to improve performance, I'd personally like to see 
>> that in a separate PR that's only focused on that, with more 
>> benchmarks/bench marking results included as well.
>
> Fair comment. Initially, 99% of template processing will be through `STR``. 
> However, that
> will likely change as libraries expand to include template processing. In 
> fact, creating
> processors is so trivially easy that users will likely start using template 
> processing for
> every day tasks.
> 
> During the preview, string templates will be evaluatedby Java influencers for 
> use in
> future applications. If we come out of the gate with a less than stellar 
> feature
> then string templates will not gain traction.
> 
> Several reviewers, including most of the compiler team, had a problem with 
> the use of
> anonymous classes for each and every string template. Carriers were 
> introduced as an
> alternative. The fact that they perform well is an added bonus. Note Carriers 
> are not a
> last minute thing. I've been using them off/on for much of the project.
> 
> Carriers also have a place in future planned projects. It was Brian that 
> inspired the idea.
> 
> Carriers, as well as the FormatProcessor, are indeed projects unto 
> themselves, but are a
> important part of the string template project. Breaking those projects out 
> separately
> would risk not making preview. It's important to get these classes tested in 
> real use
> cases during preview. In the balance, as you indicated, STR and FMT are the 
> show. Carriers
> are low risk with big win.
> 
> The linkage stuff will likely dissolve into something that doesn't require a 
> lot of 
> MethodHandle experience, something like a TemplateProcessorFactory.
> 
> I whipped up a trivial JMH benchmark to give a sense of Carrier worth. Note 
> I'm not a
> performance engineer and milage may vary, but this will give you the sense of 
> degree
> difference between Carriers and a list implementation.
> 
> 
> 
> public class MyBenchmark {
>     // Declared globally to spoil optimization.
>     static int x = 10, y = 20;
>     static String result;
>     static StringProcessor INTERPOLATE = st -> st.interpolate();
>     static List<String> FRAGMENTS = List.of("", " + ", " = ", "");
> 
>     // String concatenation as a baseline.
>     @Benchmark
>     public void concat() {
>         result = x + " + " + y + " = " + (x + y);
>     }
> 
>     // STR is effectively String concatenation.
>     @Benchmark
>     public void str() {
>         result = STR."{x} + {y} = {x + y}";
>     }
> 
>     // StringTemplate using a carrier and thus the constructed interpolate.
>     @Benchmark
>     public void carrier() {
>         result = INTERPOLATE."{x} + {y} = {x + y}";
>     }
> 
>     // StringTemplates using lists for fragments and values. 
>     // Note that FRAGMENTS is only constructed once as it would be in the 
> wilds.
>     @Benchmark
>     public void lists() {
>         StringTemplate st = StringTemplate.of(FRAGMENTS, List.of(x, y, x + 
> y));
>         result = st.interpolate();
>     }
> 
>     // Formatting can be effectively String concatenation.
>     @Benchmark
>     public void format() {
>         result = FMT."%d{x} + %d{y} = %d{x + y}";
>     }
> 
> }
> 
> 
> 
> 
> Benchmark                   Mode  Cnt         Score         Error  Units
> MyBenchmark.concat         thrpt   25  52570142.770 ± 5373985.927  ops/s
> MyBenchmark.str            thrpt   25  52569879.770 ± 2787317.140  ops/s
> MyBenchmark.carrier        thrpt   25  54161402.682 ± 4587727.795  ops/s
> MyBenchmark.lists          thrpt   25   9260123.476 ±  189741.261  ops/s
> MyBenchmark.format         thrpt   25  50847393.625 ±  135673.614  ops/s
> 
> 
> Thanks for pointing out the @Stable issue.

Thanks, I think benchmarks like this are useful. The `interpolate()` case is 
not something I considered when I made my earlier comment. Please add any 
benchmarks to the patch as well, so that performance experiments can be  
reliably reproduced.

I think what this benchmark mainly shows is the benefit of carriers in 
combination with the instance `interpolate()` method. I expect most custom 
string processors will interact with string template instances through the 
`fragments()` and `values()` methods, though (I don't think there is an 
alternative?). i.e. in the end, we still end up boxing everything into a couple 
of lists any ways. `interpolate()` also has another leg up in that it knows the 
shape of the template when the string template instance is constructed, and can 
use a specialized method handle to do the processing.

If I change the benchmark to rely on the `fragments()` and `values()` accessors 
(which I think the majority/all of the custom processor will have to do), for 
instance by using `st -> StringTemplate.interpolate(st.fragments(), 
st.values())` as a custom processor, and I tweak 
`StringTemplateImplFactory::newStringTemplate` to use `List::copyOf` instead of 
the stream it uses now [1], then the list based implementation is slightly 
faster than the carriers based implementation on my machine:


Benchmark              Mode  Cnt   Score   Error  Units
Carriers.carrier       avgt   15  69.946 � 0.599  ns/op
Carriers.carrier_inst  avgt   15  15.014 � 0.106  ns/op
Carriers.concat        avgt   15   8.441 � 0.021  ns/op
Carriers.format        avgt   15   9.314 � 0.172  ns/op
Carriers.lists         avgt   15  61.131 � 0.401  ns/op
Carriers.str           avgt   15   8.547 � 0.145  ns/op


It makes me wonder if carriers are worth it at this stage (hard to say...), or 
if it's better to go all-in on custom linkage, which should give custom 
processors performance that is comparable with `STR` and `FMT`.

Maybe the instance `interpolate()` method is an important enough use-case to 
add carriers though. I can't really say.

I appreciate the fact that carriers will see use in future projects as well, 
but at face value, it doesn't seem like they add that much for string templates 
(rather, on the contrary in some cases it seems).

[1]: http://cr.openjdk.java.net/~jvernee/bench.diff

-------------

PR: https://git.openjdk.org/jdk/pull/10889

Reply via email to