On Thu, 28 Jul 2022 15:42:15 GMT, Quan Anh Mai <d...@openjdk.org> wrote:
>> I would like to contribute an optimized version of >> `StackTraceElement#toString()` that uses a single StringBuilder throughout >> creation to avoid intermediate `String` allocations. >> `StackTraceElement#toString()` is used in a number of JDK code paths >> including `Throwable#printStackTrace()`, as well as many JDK consumers may >> transform `StackTraceElement` `toString()` in logging frameworks capturing >> throwables and exceptions, and diagnostics performing dumps. >> >> Given this usage and some observed JFR profiles from production services, >> I'd like to reduce the intermediate allocations to reduce CPU pressure in >> these circumstances. I have added a couple benchmarks for a sample >> `Throwable#printStackTrace()` converted to String via `StringWriter` and >> individual `StackTraceElement` `toString`. The former shows ~15% >> improvement, while the latter shows ~40% improvement. >> >> Before >> >> Benchmark Mode Cnt Score Error >> Units >> StackTraceElementBench.printStackTrace avgt 15 167147.066 ± 4260.521 >> ns/op >> StackTraceElementBench.toString avgt 15 132.781 ± 2.095 >> ns/op >> >> >> After >> >> Benchmark Mode Cnt Score Error >> Units >> StackTraceElementBench.printStackTrace avgt 15 142909.133 ± 2290.720 >> ns/op >> StackTraceElementBench.toString avgt 15 78.939 ± 0.469 >> ns/op > > If the objective of this patch is to optimise > `StackTraceElement::toString()`, then I would suggest the most efficient > method of calculating the length and encoding of the result, filling the > backing buffer then using the package-private constructor of > `java.lang.String` to construct the final result. > > And if you want to propose exposing > `StackTraceElement::appendTo(Appendable)`, then I think it should be a > separate patch instead. > > Thanks. Thanks for the reviews and comments! To @merykitty 's point, I am now focusing this PR on optimizing `StackTraceElement.toString()` only -- additional changes can follow on in subsequent PRs. I tested out a couple alternative implementations, and have currently landed on precomputing the StringBuilder capacity as seen in current state of this PR (commit https://github.com/openjdk/jdk/pull/9665/commits/aca442839c82688566d94cb5cb43e34b5b029642 ). This shaves a bit more off the `StackTraceElementBench.toString` microbenchmark compared to not precomputing the StringBuilder capacity. The trade off is more readability and more complex code as there are 2 passes -- one to precompute length and one to build the resulting String. Benchmark Mode Cnt Score Error Units StackTraceElementBench.printStackTrace avgt 15 143218.862 ± 840.141 ns/op StackTraceElementBench.toString avgt 15 66.566 ± 0.288 ns/op Alternately, I computed the length and collected into a `char[]` then converted to String as seen in commit https://github.com/openjdk/jdk/commit/de0a337e7f18f9a1525ec222ca22ced93166bf2e . This ends up similar to the presized StringBuilder, with even more complex code, so doesn't seem worth going that route. Benchmark Mode Cnt Score Error Units StackTraceElementBench.printStackTrace avgt 15 133324.365 ± 1133.849 ns/op StackTraceElementBench.toString avgt 15 68.701 ± 1.574 ns/op I did not write up an implementation to buffer & encode to a `byte[]` and create a resulting string from that via the String constructors or `newStringUTF8NoRepl`. I'm not sure that paying the complexity of UTF-16 handling directly would be worth improvement removing the `char[]` -> `byte[]` conversion & copy, especially in most common case of Latin 1 `StackTraceElement` contents. ------------- PR: https://git.openjdk.org/jdk/pull/9665