On 06/11/2013 21:54, Robert Stupp wrote:
Hi,
I was wondering why the most frequently allocated class in nearly all applications is char[]. A
deeper inspection showed that a lot of these char[] allocations are "caused" by the code in
java.lang.StringBuilder.toString(), which creates a copy of its internal char array. Most
StringBuilder instances are no longer used after the call to StringBuilder.toString(). Many of
these instances are created by code that javac generates for "plain" string concatenation in
source code.
Wouldn't it be worth trying whether passing the (Abstract)StringBuilder's value+count values
to String results in fewer temporary objects and therefore reduces pressure on the new
generation (and reduces GC effort)? My idea is to add a field 'shared' to
AbstractStringBuilder, which is set when StringBuilder.toString() is called. If the
StringBuilder is actually modified after calling toString(), the StringBuilder creates a
new copy of the value array and resets the 'shared' field. Since the value array might be
longer than the current count, the String class would need a "re-invention" of the
count field.
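The 'shared' idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: SharingBuilder and CharSlice are hypothetical names, not JDK classes, and CharSlice stands in for a String that keeps a count field next to a possibly longer value array.

```java
// Hypothetical stand-in for a String that carries its own count,
// because the shared value array may be longer than the text.
final class CharSlice {
    private final char[] value;
    private final int count;

    CharSlice(char[] value, int count) {
        this.value = value;
        this.count = count;
    }

    int length() { return count; }

    @Override
    public String toString() { return new String(value, 0, count); }
}

// Hypothetical builder that hands out its array instead of copying it,
// and copies only if it is modified afterwards (copy-on-write).
final class SharingBuilder {
    private char[] value = new char[16];
    private int count;
    private boolean shared; // set by toSlice(), reset on the next write

    private void ensureWritable(int minCapacity) {
        if (shared || minCapacity > value.length) {
            int newCapacity = minCapacity > value.length
                    ? Math.max(minCapacity, value.length * 2 + 2)
                    : value.length;
            value = java.util.Arrays.copyOf(value, newCapacity);
            shared = false;
        }
    }

    SharingBuilder append(String s) {
        ensureWritable(count + s.length());
        s.getChars(0, s.length(), value, count);
        count += s.length();
        return this;
    }

    void setCharAt(int index, char c) {
        if (index < 0 || index >= count) throw new IndexOutOfBoundsException();
        ensureWritable(count);
        value[index] = c;
    }

    // No copy here: the array is shared with the returned slice.
    CharSlice toSlice() {
        shared = true;
        return new CharSlice(value, count);
    }
}
```

In the common case (build, call toSlice() once, discard the builder) no copy of the array is made. The copy happens only on the first write after the array has been handed out.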
Another thing I noticed is that the StringBuilder instances transiently created by javac
seem to use the default constructor. But a huge number of string concatenations in Java
code result in (much) longer Strings, which means that appends repeatedly create a new, larger
copy of the value array in AbstractStringBuilder. Is it possible to add some
"guessing" for the initial capacity? This would eliminate a lot of temporary
objects and reduce GC effort. Is it worth checking this out? Are the two places in
com.sun.tools.javac.jvm.Gen#visitAssignop/visitBinary the only places where these
StringBuilder instances are created?
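To make the capacity question concrete, here is a minimal sketch comparing the two shapes. It is an assumption that the first method resembles what javac emits for a + b + c (based only on the observation above), and ConcatCapacity with its method names is hypothetical.

```java
final class ConcatCapacity {
    // Assumed shape of the compiler-generated code for: a + b + c
    // (default constructor, documented initial capacity of 16 chars).
    static String defaultCapacity(String a, String b, String c) {
        return new StringBuilder().append(a).append(b).append(c).toString();
    }

    // Hand-written variant that "guesses" the capacity from the operands,
    // so the internal array never has to grow during the appends.
    static String presized(String a, String b, String c) {
        int capacity = len(a) + len(b) + len(c);
        return new StringBuilder(capacity).append(a).append(b).append(c).toString();
    }

    // A null String operand is appended as the four characters "null".
    private static int len(String s) {
        return s == null ? 4 : s.length();
    }
}
```

Both methods return the same String. The difference is only in how many intermediate char[] copies the builder makes while growing, which matters once the result is longer than 16 characters.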
-
Robert
I can't speak to the details but HotSpot has an optimization that
recognizes some cases like
new-StringBuilder(...).append(...).append(...).toString() where it can
avoid the copy. This may be something to follow up on on the hotspot list
if you want to get into the details.
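For illustration only, these are the two shapes in question. Which of them the JIT actually optimizes depends on the HotSpot version and compiler, so the comments below are assumptions rather than guarantees; ConcatShapes is a hypothetical class name.

```java
final class ConcatShapes {
    // A single fluent chain: the builder is created, appended to and
    // converted in one expression and is never visible anywhere else.
    // This is the kind of pattern described above.
    static String chained(String name, int n) {
        return new StringBuilder().append(name).append('=').append(n).toString();
    }

    // The builder is kept in a local and modified after toString().
    // Code like this does not match the simple chain shape, so it is
    // an assumption that it would not get the same treatment.
    static String reused(String name, int n) {
        StringBuilder sb = new StringBuilder();
        sb.append(name).append('=').append(n);
        String first = sb.toString();
        sb.append(';'); // modification after toString()
        return first;
    }
}
```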
-Alan.