Re: [Math] Make everything "Serializable" ?

sebb Sat, 11 Feb 2012 05:05:17 -0800

On 11 February 2012 12:29, Luc Maisonobe <luc.maison...@free.fr> wrote:
> On 11/02/2012 11:25, sebb wrote:
>> On 11 February 2012 09:26, Luc Maisonobe <luc.maison...@free.fr> wrote:
>>> Hi,
>>>
>>> On 11/02/2012 05:10, Bill Barker wrote:
>>>> While the development team has exploded for [MATH], maintaining
>>>> Serializable interfaces is expensive and historically hasn't been kept
>>>> up.
>>>
>>> I don't agree.
>>>
>>> Maintaining serialization is hard only if you want to be
>>> able to deserialize with version n something that was serialized with
>>> version p. This situation appears either if you use serialization for
>>> long term storage or if you have a distributed application with two
>>> parts using different levels. The first case is a wrong use of
>>> serialization. The second case can be simply declared to be unsupported.
>>>
>>> Apart from that, maintaining serialization is updating serialVersionUID,
>>> which we did (and IMHO it is no big deal if it is not done due to the
>>> reasons explained in previous paragraph).
>>
>> There's a bit more to it than that.
>>
>> If a class is updated to add new fields, these either need to be
>> serialisable themselves, or be marked transient.
>> And transient can only be used if the field is not needed until after
>> deserialisation.
>
> Yes, but Checkstyle helps a lot in detecting this.


Checkstyle (and Findbugs) warn you if the field is not serialisable.

Do they also warn you if a transient field is not set by deserialisation?
[Cannot remember offhand]

Even if so, each such report would have to be investigated to
establish whether it is OK to omit the field init or not, and if it is
OK to omit, this would need to be documented and the checkstyle config
updated to skip the warning.
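
To make the transient case concrete, here is a minimal sketch. The class
CachedNorm is hypothetical (it is not part of Commons Math); it assumes
the transient field is a derived value that can be recomputed from the
serialised fields. Without the readObject method the field would
silently come back as 0.0.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class for illustration only, not part of Commons Math.
class CachedNorm implements Serializable {
    private static final long serialVersionUID = 1L;

    private final double x;
    private final double y;

    // Derived value: skipped by default serialisation, so it would be 0.0
    // after deserialisation unless readObject recomputes it.
    private transient double norm;

    CachedNorm(double x, double y) {
        this.x = x;
        this.y = y;
        this.norm = Math.hypot(x, y);
    }

    double getNorm() {
        return norm;
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();   // restores x and y
        norm = Math.hypot(x, y);  // rebuild the transient field
    }

    // Serialise to a byte array and read the copy back.
    static CachedNorm roundTrip(CachedNorm original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CachedNorm) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Whether a tool flags the missing readObject or not, someone still has
to decide per field whether recomputing, defaulting or omitting it is
correct.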

>> Also, some consideration must be given to the serialised form, to
>> ensure it is appropriate.
>
> I don't understand, can you explain a little more ?

The default serialised form effectively copies the physical
representation of an object.
For simple objects that works fine, but for some data structures it
may be totally unsuitable.

An example from Effective Java [Bloch] is of a class that implements a
doubly-linked list of Strings.
The links are a big (and unnecessary) overhead on the serialised form;
only the Strings themselves really need to be passed across.
[The simple approach is to serialise the count, followed by the Strings].
Using the default serialisation will be a lot slower and may even
cause stack overflow as the default implementation uses recursive
traversal of the links.
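
A rough sketch of that simple approach, loosely following the idea in
Effective Java rather than reproducing its code. The class StringList
and its helper methods are hypothetical names chosen for illustration.
All link fields are transient, and writeObject/readObject carry only
the count and the Strings.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration only.
class StringList implements Serializable {
    private static final long serialVersionUID = 1L;

    // The physical representation is transient, so the links are never
    // written to the stream.
    private transient int size;
    private transient Node head;
    private transient Node tail;

    private static final class Node {
        final String value;
        Node previous;
        Node next;

        Node(String value) {
            this.value = value;
        }
    }

    void add(String value) {
        Node node = new Node(value);
        if (tail == null) {
            head = node;
        } else {
            tail.next = node;
            node.previous = tail;
        }
        tail = node;
        size++;
    }

    List<String> toList() {
        List<String> values = new ArrayList<>(size);
        for (Node n = head; n != null; n = n.next) {
            values.add(n.value);
        }
        return values;
    }

    // Logical form: the count, followed by the Strings in order.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(size);
        for (Node n = head; n != null; n = n.next) {
            out.writeObject(n.value);
        }
    }

    // Rebuild the links with a loop, so there is no deep recursion.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            add((String) in.readObject());
        }
    }

    // Serialise to a byte array and read the copy back.
    static StringList roundTrip(StringList original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (StringList) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The cost is that the serialised form is now part of the class's
contract and has to be written, reviewed and tested by hand.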

Another example - serialising a hash table: hash codes are not guaranteed
to be the same across JVMs (or even across runs), so the deserialised
table may be corrupt.

There's also the issue that de-serialisation does not use the
constructor, so any invariants which are established by the ctor (e.g.
non-null parameter) also need to be maintained by the readObject
method.
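
As an illustration, assuming a hypothetical Range class whose
constructor enforces lower <= upper (the same pattern applies to a
non-null check), the readObject method has to repeat the check:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class for illustration only.
class Range implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int lower;
    private final int upper;

    Range(int lower, int upper) {
        if (lower > upper) {
            throw new IllegalArgumentException("lower > upper");
        }
        this.lower = lower;
        this.upper = upper;
    }

    int getLower() {
        return lower;
    }

    int getUpper() {
        return upper;
    }

    // Deserialisation bypasses the constructor, so its check is repeated here.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        if (lower > upper) {
            throw new InvalidObjectException(
                    "lower > upper: " + lower + " > " + upper);
        }
    }

    // Serialise a Range to a byte array.
    static byte[] toBytes(Range range) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(range);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Deserialise a Range; stream errors are wrapped in IllegalStateException.
    static Range fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(data))) {
            return (Range) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this in place, a stream whose bytes were edited so that lower
exceeds upper fails with InvalidObjectException instead of yielding an
object the constructor would have rejected.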

>>
>> Serialisation and final fields don't play well together.
>>
>>>> So I would go for requiring the user to do something like:
>>>>
>>>> public class MyPolynomialSplineFunction extends
>>>> PolynomialSplineFunction implements Serializable {
>>>>   private static final long serialVersionUID = <something>;
>>>>
>>>>   // put non-default constructors here
>>>> }
>>>
>>> I'm not sure it scales with more than one level of aggregation. What
>>> would happen if the user wants to serialize one of our top level objects
>>> that itself embeds one of our lower level objects ?
>>
>> Or indeed if PolynomialSplineFunction is final?
>>
>>>>
>>>> it is less than a minute to do this in eclipse, so it should be on the
>>>> user for classes like this.
>>>
>>> I am strongly in favor of putting Serializable where it can be put. Once
>>> again, it's not an absolute rule which should be done in one sweep, it's
>>> more a small low priority regular task to add Serializable here and there
>>> as we see.
>>
>> Even without trying to maintain version compatibility, adding
>> Serialisation can be non-trivial.
>>
>> It should also be tested - I don't think there are any unit tests yet.
>
> All step handlers in the ODE package are tested for serialization, as
> well as EmpiricalDistribution and the various descriptive statistics
> classes, polynomial function, complex, fractions, vectors, matrices ...
>
> There are even dedicated methods in TestUtils: serializeAndRecover and
> checkSerializedEquality.

OK, I'd not noticed those.

However, adding Serializable still means devising and implementing the
appropriate unit tests.
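
For reference, the general shape of such a round-trip check is sketched
below. This is a standalone illustration and not the actual
TestUtils.serializeAndRecover or checkSerializedEquality code; the
names SerializationRoundTrip, roundTrip and Ratio are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper, not the Commons Math TestUtils implementation.
class SerializationRoundTrip {
    // Serialise an object to memory and read a copy back.
    static <T extends Serializable> T roundTrip(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Hypothetical value class standing in for a library class under test.
final class Ratio implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int numerator;
    private final int denominator;

    Ratio(int numerator, int denominator) {
        this.numerator = numerator;
        this.denominator = denominator;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Ratio)) {
            return false;
        }
        Ratio that = (Ratio) other;
        return numerator == that.numerator && denominator == that.denominator;
    }

    @Override
    public int hashCode() {
        return 31 * numerator + denominator;
    }
}
```

A unit test would then assert that roundTrip(x) is a distinct object
that equals x, with matching hash codes. That only works if the class
has a meaningful equals, which is one more thing to check per class.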

==

I'm not saying don't do it, just that it involves a lot more work than
might be obvious initially.

