Hi,
On 7/19/2017 5:49 AM, Peter Levart wrote:
Hi Claes,
On 07/17/2017 02:16 PM, Claes Redestad wrote:
Hi Peter!
On 2017-07-15 14:08, Peter Levart wrote:
It seems that interning signature(s) is important for correctness
(for example, in ObjectOutputStream.writeTypeString(str) the 'str'
is used to look up a handle so that handles are put into the stream
instead of the type signature(s) for multiple references to the same
type). Looking up objects in the handle table is based on identity
comparison.
Yes, interned signatures are important for correctness (and performance?)
of the current serialization implementation.
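To illustrate why identity-based lookup makes interning matter, here is a
minimal sketch. It is not the actual private handle table in
ObjectOutputStream; HandleTableSketch and lookupOrAssign are hypothetical
names, and an IdentityHashMap stands in for the real structure.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for an identity-based handle table.
final class HandleTableSketch {
    private final Map<Object, Integer> handles = new IdentityHashMap<>();

    /** Returns the existing handle for obj, or -1 after assigning a new one. */
    int lookupOrAssign(Object obj) {
        Integer handle = handles.get(obj);   // compares keys with ==, not equals()
        if (handle != null) {
            return handle;
        }
        handles.put(obj, handles.size());
        return -1;
    }

    public static void main(String[] args) {
        // Two equal but distinct String instances for the same type signature.
        String a = new String("Ljava/lang/String;");
        String b = new String("Ljava/lang/String;");

        HandleTableSketch raw = new HandleTableSketch();
        raw.lookupOrAssign(a);
        // Miss: b is a different object, so the signature would be written again.
        System.out.println("without intern: " + raw.lookupOrAssign(b));

        HandleTableSketch interned = new HandleTableSketch();
        interned.lookupOrAssign(a.intern());
        // Hit: both interned references are the same object, so a handle is reused.
        System.out.println("with intern:    " + interned.lookupOrAssign(b.intern()));
    }
}
```

With an identity-keyed table, two equal signature strings only share a handle
if they are the same object, which is what interning guarantees.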
But there might be a way to obtain a singleton signature String per
type and still profit. By adding a field to java.lang.Class and
caching the JVM signature there. This would also be a useful public
method, don't you think?
I have a nagging feeling that we should be careful about leaking
implementation details about the underlying VM through public APIs,
since making changes to various specifications is hard enough as it is.
You're right. There are already more than enough "implementation
details" pertaining to the JVM exposed through the reflection API, which
was supposed to represent the Java language's view of the world. JVM
type signatures just happen to be used in serialization too, which is
another implementation detail which might change in the future (with
value types etc), so it's better to keep it private.
right
Out of 191 ObjectStreamField constructions I found in JDK sources,
there are only 39 distinct field types involved, so the number of
intern() calls is reduced by a factor of ~5. There's no need to
cache signature in ObjectStreamField(s) this way any more, but there
must still be a single final field for ObjectStreamField(s)
constructed with explicit signature(s).
Here's what this looks like in code:
http://cr.openjdk.java.net/~plevart/misc/Class.getJvmTypeSignature/webrev.01/
Could this be done as a ClassValue instead of another field on Class? My
guess is only a small number of classes in any given app will be directly
involved in serialization, so growing Class seems to be a pessimization.
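A rough sketch of what the ClassValue variant could look like follows. This
is not the code from the webrev; SignatureCacheSketch, jvmSignature and
signatureOf are hypothetical names, and the signature is computed by hand
here purely for illustration.

```java
// Hypothetical sketch: one interned JVM signature String per Class,
// computed lazily and cached via ClassValue instead of a field on Class.
final class SignatureCacheSketch {
    private static final ClassValue<String> SIGNATURES = new ClassValue<String>() {
        @Override
        protected String computeValue(Class<?> type) {
            // Runs only for classes that are actually asked for a signature;
            // the result is then cached, so intern() is not called per lookup.
            return jvmSignature(type).intern();
        }
    };

    /** Builds the JVM type signature for the given class (uncached). */
    static String jvmSignature(Class<?> type) {
        if (type.isArray()) {
            return "[" + jvmSignature(type.getComponentType());
        }
        if (type.isPrimitive()) {
            if (type == boolean.class) return "Z";
            if (type == byte.class)    return "B";
            if (type == char.class)    return "C";
            if (type == short.class)   return "S";
            if (type == int.class)     return "I";
            if (type == long.class)    return "J";
            if (type == float.class)   return "F";
            if (type == double.class)  return "D";
            return "V"; // void is the only remaining primitive pseudo-type
        }
        return "L" + type.getName().replace('.', '/') + ";";
    }

    /** Returns the cached, interned signature for the given class. */
    static String signatureOf(Class<?> type) {
        return SIGNATURES.get(type);
    }

    public static void main(String[] args) {
        System.out.println(signatureOf(String.class));
        System.out.println(signatureOf(int[][].class));
        // Same String instance on repeated lookups for the same Class.
        System.out.println(signatureOf(String.class) == signatureOf(String.class));
    }
}
```

Classes that never take part in serialization pay nothing with this approach,
whereas a new field would grow every Class instance.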
It could be, yes. We are trying to solve two issues here. One is the
original 8184603 which is concerned with start-up overhead and your
proposal is the right solution for it as it only delays the work to
when/if it is needed. The other issue is the overhead of repeated
signature interning. These calls are not frequent enough to matter in cases that just
create a bunch of ObjectStreamField instances assigned to static final
fields, but I suspect are more frequent when signatures are being
de-serialized from stream. At that time, we don't yet have a Class
object to go with the signature and to use as a caching anchor, but we
still want to keep the invariant of OSF signature(s) being interned
Strings. Whether they really need to be interned right away in that case is
a question that needs more study of the deserialization code.
The package-private ObjectStreamField constructor (name, signature,
unshared) is used only to create temporary OSF objects during
deserialization. Those OSF instances are compared with the OSF instances
created from the local class to determine common fields.
The signature.intern() in that constructor is not significant.
The signature.intern() in the public constructor is not important for
correctness; comparisons between signatures use equals().
It may have a slight performance or size impact on the object streams,
because otherwise equivalent signatures will be serialized as separate
strings.
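As a small illustration of matching fields by value rather than by identity,
here is a sketch using only the public ObjectStreamField API. FieldMatchSketch
and sameField are hypothetical and are not the matching logic the JDK uses
internally.

```java
import java.io.ObjectStreamField;
import java.util.Objects;

// Hypothetical helper: compares two fields by name and type using equals(),
// so the result does not depend on the signature Strings being interned.
final class FieldMatchSketch {
    static boolean sameField(ObjectStreamField a, ObjectStreamField b) {
        return a.getName().equals(b.getName())
                && a.getTypeCode() == b.getTypeCode()
                // getTypeString() returns null for primitive fields.
                && Objects.equals(a.getTypeString(), b.getTypeString());
    }

    public static void main(String[] args) {
        ObjectStreamField local = new ObjectStreamField("name", String.class);
        ObjectStreamField other = new ObjectStreamField("name", String.class);
        ObjectStreamField differs = new ObjectStreamField("name", Integer.class);

        System.out.println(local.getTypeString());
        System.out.println(sameField(local, other));
        System.out.println(sameField(local, differs));
    }
}
```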
What do you think?
I wonder what workloads actually see a bottleneck in these String.intern
calls, and *which* String.intern calls we are bottlenecking on in these
workloads. There's still a couple of constructors here that won't see a
speedup.
Right. I suspect the intern() call bottleneck is most problematic when
deserializing. All other cases could be optimized by caching the
signature on the appropriate Class object(s) via ClassValue for example.
I'd remove the intern in the package-private constructor.
I think we need more data to ensure this is actually worthwhile to pursue,
or whether there are other optimizations on a higher level that could
be done.
Ok, we agree that no new public API for JVM signatures is desired and
that the intern() call bottleneck during deserialization should be
researched more deeply. I agree that your solution is currently the
best for the original issue.
Ditto.
Roger