Re: RFR: 8313961: Enhance identification of special serialization methods

Raffaello Giulietti Wed, 23 Aug 2023 03:24:55 -0700

Here's some more context about the issues that the PR attempts to fix.

Currently, serialization identifies special "magic" methods by querying classical reflection. It does so by invoking `getDeclaredMethod(name, parameterTypes)`. This works correctly under the assumption that a serializable class contains at most one method with that name and those parameter types. While the assumption is valid in the Java language, it is not necessarily valid for JVM classes.
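As a rough illustration of this kind of lookup, here is a minimal sketch. The class `MagicMethodLookup`, the helper `findPrivateVoidMethod` and the sample class `Point` are hypothetical names made up for this example; the checks on return type and modifiers are assumptions about what makes a method "relevant", not a copy of the JDK code.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class MagicMethodLookup {

    /** Ordinary Java class: the language allows at most one writeObject(ObjectOutputStream). */
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        int x, y;

        private void writeObject(ObjectOutputStream oos) throws IOException {
            oos.defaultWriteObject();
        }
    }

    /**
     * Hypothetical helper: asks reflection for ONE method by name and parameter
     * types, then checks that it is private, non-static and returns void.
     * Returns null if no method is found or the one returned does not qualify.
     */
    static Method findPrivateVoidMethod(Class<?> cl, String name, Class<?>... paramTypes) {
        try {
            Method m = cl.getDeclaredMethod(name, paramTypes);
            int mods = m.getModifiers();
            boolean relevant = m.getReturnType() == void.class
                    && Modifier.isPrivate(mods)
                    && !Modifier.isStatic(mods);
            return relevant ? m : null;
        } catch (NoSuchMethodException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Method m = findPrivateVoidMethod(Point.class, "writeObject", ObjectOutputStream.class);
        System.out.println(m != null ? "found: " + m : "not found");
    }
}
```

Note that the helper only ever inspects the single method that `getDeclaredMethod` hands back, so its answer depends on which method reflection happens to pick.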

The specification of `Class::getDeclaredMethod` makes it clear that in case there are multiple methods with the same name and parameter types, the one with the most specific return type is chosen, or an arbitrary one is returned if there's no most specific one. This non-determinism means that serialization might fail to identify the relevant method. Worse, it might identify the relevant method on some implementations of the JDK, or during some runs, and fail to find it on other implementations or in other runs.

For example, suppose a serializable class (in the JVMS sense) contains the following methods (pseudo-Java):

```
    private int writeObject(ObjectOutputStream oos) {...}
    private void writeObject(ObjectOutputStream oos) {...}
```

Neither method has a most specific return type. Depending on how the methods appear in the class, the current implementation of `Class::getDeclaredMethod` returns the first or the second method. Only the second is relevant for serialization. When the first is returned, the relevant method is ignored by serialization, despite being present in the class.

The solution proposed by the PR identifies the relevant method even in such cases. However, this might break behavioral compatibility for classes which, for example, have multiple `writeObject(ObjectOutputStream)` methods.
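One way a lookup could avoid depending on the choice made by `getDeclaredMethod` is to scan all declared methods and match on the return type as well. The sketch below only illustrates that idea and is not necessarily what the PR does; `MagicMethodScan`, `findMagicMethod` and `Point` are hypothetical names. Since duplicate signatures cannot be written in Java source, the demo class is an ordinary one.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;

class MagicMethodScan {

    /** Ordinary Java class used only to exercise the helper below. */
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        int x, y;

        private void writeObject(ObjectOutputStream oos) throws IOException {
            oos.defaultWriteObject();
        }
    }

    /**
     * Hypothetical helper: scans every declared method and returns the one whose
     * name, parameter types AND return type (void) all match, and which is
     * private and non-static. A class file cannot declare two methods with the
     * same name and descriptor, so at most one method can satisfy the first
     * three conditions and the first match can be returned directly.
     */
    static Method findMagicMethod(Class<?> cl, String name, Class<?>... paramTypes) {
        for (Method m : cl.getDeclaredMethods()) {
            int mods = m.getModifiers();
            if (m.getName().equals(name)
                    && Arrays.equals(m.getParameterTypes(), paramTypes)
                    && m.getReturnType() == void.class
                    && Modifier.isPrivate(mods)
                    && !Modifier.isStatic(mods)) {
                return m;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Method m = findMagicMethod(Point.class, "writeObject", ObjectOutputStream.class);
        System.out.println(m != null ? "found: " + m : "not found");
    }
}
```

With this approach a homonymous `int writeObject(ObjectOutputStream)` would simply be skipped, which is also where the compatibility risk comes from: a class whose `void` variant was previously ignored would now have it invoked.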

A similar problem exists for special "magic" fields, which are currently identified by invoking `getDeclaredField(name)`. Here, however, there are a couple of additional issues.

Firstly, the specification of `Class::getDeclaredField` completely ignores the possible presence of multiple fields with the same name. It says nothing about this case, and the method returns an arbitrary one of the homonymous fields if there is more than one.

Secondly, even if the choice is made deterministic, it is unclear which field to choose when there are multiple ones.

Consider a class with 4 `serialPersistentFields` fields (pseudo-Java)
```
    private static final Object serialPersistentFields = new ObjectStreamField[0];
    private static final Cloneable serialPersistentFields = new ObjectStreamField[0];
    private static final Serializable serialPersistentFields = new ObjectStreamField[0];
    private static final ObjectStreamField[] serialPersistentFields = new SubclassOfObjectStreamField[0];
```
Which one to choose? Which one is preferable?
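To make the question concrete, the sketch below enumerates all declared fields with a given name and then applies one conceivable selection rule: take the field whose declared type is exactly `ObjectStreamField[]`. That rule is purely an assumption for illustration, not a recommendation and not necessarily what the PR does; `MagicFieldScan`, `candidates`, `pickByDeclaredType` and `Account` are hypothetical names.

```java
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class MagicFieldScan {

    /** Ordinary Java class: the language allows only one field per name. */
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        private static final ObjectStreamField[] serialPersistentFields = {
            new ObjectStreamField("id", long.class)
        };
        long id;
    }

    /** Hypothetical helper: all private static final fields with the given name. */
    static List<Field> candidates(Class<?> cl, String name) {
        int mask = Modifier.PRIVATE | Modifier.STATIC | Modifier.FINAL;
        List<Field> result = new ArrayList<>();
        for (Field f : cl.getDeclaredFields()) {
            if (f.getName().equals(name) && (f.getModifiers() & mask) == mask) {
                result.add(f);
            }
        }
        return result;
    }

    /**
     * One conceivable rule (an assumption): pick the candidate whose declared
     * type is exactly the given type, or null if there is none.
     */
    static Field pickByDeclaredType(List<Field> fields, Class<?> type) {
        for (Field f : fields) {
            if (f.getType() == type) {
                return f;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Field> found = candidates(Account.class, "serialPersistentFields");
        System.out.println("candidates: " + found.size());
        System.out.println("picked: " + pickByDeclaredType(found, ObjectStreamField[].class));
    }
}
```

For a JVM class like the one above, `candidates` would return four fields, and a rule based on the declared type would ignore the three whose values are nonetheless `ObjectStreamField[]` instances at run time.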

We thus face a dilemma: the current behavior is potentially non-deterministic, which is uncomfortable, although it does not seem to be a problem in practice. On the other hand, any solution to non-determinism can potentially break existing classes at run-time.


Which one is the lesser evil?
Before progressing with the PR, I'd like to hear more opinions.


Greetings
Raffaello
