Here's some more context about the issues that the PR attempts to fix.

Currently, serialization identifies special "magic" methods by querying core reflection. It does so by invoking `getDeclaredMethod(name, parameterTypes)`. This works correctly under the assumption that a serializable class contains at most one method with that name and those parameter types. While the assumption is valid in the Java language, it is not necessarily valid for JVM classes.

The specification of `Class::getDeclaredMethod` makes it clear that when there are multiple methods with the same name and parameter types, the one with the most specific return type is chosen, or an arbitrary one is returned if there is no most specific one. This non-determinism means that serialization might fail to identify the relevant method. Worse, it might identify the relevant method on some implementations of the JDK, or during some runs, and fail to find it on other implementations or in other runs.

For example, suppose a serializable class (in the JVMS sense) contains the following methods (pseudo-Java):
```
    private int writeObject(ObjectOutputStream oos) {...}
    private void writeObject(ObjectOutputStream oos) {...}
```
Neither return type is more specific than the other. Depending on how the methods appear in the class, the current implementation of `Class::getDeclaredMethod` returns the first or the second method. Only the second is relevant for serialization. When the first is returned, serialization ignores the relevant method, even though it is present in the class.

The solution proposed by the PR identifies the relevant method even in such cases. However, this might break behavioral compatibility for classes which, for example, have multiple `writeObject(ObjectOutputStream)` methods.
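To make the idea concrete, here is a minimal sketch of a lookup that does not depend on which homonymous method `getDeclaredMethod` happens to return. It is not the code from the PR: the class `MagicMethodLookup` and the helper `findPrivateVoidMethod` are hypothetical names, and the filter (private, non-static, `void` return type) is my reading of what makes `writeObject` relevant for serialization.

```java
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;

public class MagicMethodLookup {

    // Hypothetical helper: scans all declared methods instead of asking
    // getDeclaredMethod to pick one among several homonymous candidates.
    public static Method findPrivateVoidMethod(Class<?> cl, String name,
                                               Class<?>... paramTypes) {
        for (Method m : cl.getDeclaredMethods()) {
            int mods = m.getModifiers();
            if (m.getName().equals(name)
                    && m.getReturnType() == void.class
                    && Arrays.equals(m.getParameterTypes(), paramTypes)
                    && Modifier.isPrivate(mods)
                    && !Modifier.isStatic(mods)) {
                // A class file cannot contain two methods with the same
                // name and descriptor, so at most one method matches.
                return m;
            }
        }
        return null; // no relevant method declared in cl
    }

    // Plain Java sample; a class with two writeObject methods differing only
    // in return type cannot be written in Java source.
    public static class Sample implements Serializable {
        private void writeObject(ObjectOutputStream oos) { }
    }

    public static class NoMagic implements Serializable { }

    public static void main(String[] args) {
        System.out.println(findPrivateVoidMethod(
                Sample.class, "writeObject", ObjectOutputStream.class));
        System.out.println(findPrivateVoidMethod(
                NoMagic.class, "writeObject", ObjectOutputStream.class));
    }
}
```

Because the return type is part of the filter, the result no longer depends on the order of the methods in the class, which is exactly what changes the observable behavior for classes that relied on the old outcome.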



A similar problem exists for special "magic" fields, which are currently identified by invoking `getDeclaredField(name)`. Here, however, there are a couple of additional issues.

Firstly, the specification of `Class::getDeclaredField` completely ignores the possible presence of multiple fields with the same name. It says nothing about this case, and the method returns an arbitrary one of the homonymous fields if there is more than one.

Secondly, even if the choice is made deterministic, it is unclear which field to choose when there are several.
Consider a class with four `serialPersistentFields` fields (pseudo-Java):
```
    private static final Object serialPersistentFields = new ObjectStreamField[0];
    private static final Cloneable serialPersistentFields = new ObjectStreamField[0];
    private static final Serializable serialPersistentFields = new ObjectStreamField[0];
    private static final ObjectStreamField[] serialPersistentFields = new SubclassOfObjectStreamField[0];
```
Which one to choose? Which one is preferable?
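Whatever rule is eventually chosen, the candidates first have to be enumerated rather than picked implicitly. The sketch below only does that enumeration and deliberately makes no choice. The names `MagicFieldCandidates` and `declaredFieldsNamed` are hypothetical, and it assumes that `getDeclaredFields` reports every field present in the class file, including homonymous ones.

```java
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class MagicFieldCandidates {

    // Hypothetical helper: collects every declared field with the given name,
    // so that an ambiguity is visible instead of being silently resolved.
    public static List<Field> declaredFieldsNamed(Class<?> cl, String name) {
        List<Field> result = new ArrayList<>();
        for (Field f : cl.getDeclaredFields()) {
            if (f.getName().equals(name)) {
                result.add(f);
            }
        }
        return result;
    }

    // Plain Java sample with a single field; the four-field class above
    // cannot be expressed in Java source.
    public static class Sample implements Serializable {
        private static final ObjectStreamField[] serialPersistentFields =
                new ObjectStreamField[0];
        private int other;
    }

    public static void main(String[] args) {
        // Print the declared type of each candidate.
        for (Field f : declaredFieldsNamed(Sample.class, "serialPersistentFields")) {
            System.out.println(f.getType().getName() + " " + f.getName());
        }
    }
}
```

For the four-field class above, such a list would presumably have four entries with different declared types, and the open question is which of them, if any, serialization should honor.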



We thus face a dilemma: the current behavior is potentially non-deterministic, which is uncomfortable, although it does not seem to be a problem in practice. On the other hand, any fix for the non-determinism can potentially break existing classes at run-time.

Which one is the lesser evil?
Before progressing with the PR, I'd like to hear more opinions.


Greetings
Raffaello
