Hello Luc.

Hi Gilles,


> [...]
[...]

<side rant>Historically, we did not care about thread-safety at all
in [math], assuming the standard use case was *always* going to be
one instance per thread.  The statistics aggregators are an example
where multithreaded access makes sense, but this is much more the
exception than the rule in [math]. I would really like to get clear on
which classes really need to be threadsafe themselves rather than
blindly assuming that all do and insisting that everything be
immutable so that we don't have to think about how to make things
threadsafe.</side rant>


We would be more able to advance if we consider that "threadsafe"
and "immutable" are _not_ interchangeable.

We have seen (in Commons Math also) that immutability can provide
thread-safety in a cheap way. Immutability is mostly easy to
implement when an object's fields are initialized with data that is
guaranteed to not be held by another object, and whose accessors
will forbid modification.
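
A minimal sketch of that "easy" case, assuming a hypothetical class
"Weights" (not an actual [math] type): the constructor copies its input
so that no other object holds the data, and the accessor returns a copy
so that callers cannot modify the internal state.

```java
import java.util.Arrays;

// Hypothetical class, for illustration only (not part of Commons Math).
final class Weights {
    private final double[] values;

    Weights(double[] values) {
        // Copy on the way in: the caller keeps its own array.
        this.values = values.clone();
    }

    double[] getValues() {
        // Copy on the way out: callers never see the internal array.
        return values.clone();
    }
}

class WeightsDemo {
    public static void main(String[] args) {
        double[] input = {1.0, 2.0};
        Weights w = new Weights(input);
        input[0] = 42.0;          // does not affect w
        w.getValues()[1] = 42.0;  // modifies a temporary copy only
        System.out.println(Arrays.toString(w.getValues())); // prints [1.0, 2.0]
    }
}
```

Without the two "clone()" calls the fields would still be "final", but
the object would not be immutable: "final" only fixes the reference,
not the contents of the array.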

Immutability is not used only for thread safety.

It is an interesting property even for single threaded applications and
more so for applications that are built from several layers developed by
different teams (which is obviously the case for [math] as we only
develop the lowest layer).

Without immutability, developers at upper levels do not only have to
know the API, they also have to know the gory details of *all*
underlying levels and how they manage their input. Otherwise, they need
to perform copies by themselves for defensive programming, and often
forget to do so.

You don't have to convince me that immutability is a nice feature; I try
to have it everywhere I can. :-)
But by actually trying to have it _in the context of the fluent API
paradigm_ I discovered the hard way that it is not always nice (hence
this thread).

Looking on the web about fluent API, most design/usage/examples do not
provide immutability: the primary purpose of "fluent" is to chain
on-the-fly modifications of an object's properties. [This is the complete
opposite of immutability!]
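
As a rough illustration of that common "fluent" style, here is a sketch
with a hypothetical class "MutableFitter" (the name and its two
properties are assumptions, not [math] code): each "withXxx" call
modifies the object in place and returns "this" so that calls can be
chained.

```java
// Hypothetical class for illustration; not an actual Commons Math type.
class MutableFitter {
    private int maxIterations = 100;
    private double tolerance = 1e-6;

    // Each "withXxx" modifies this object in place and returns it for chaining.
    MutableFitter withMaxIterations(int n) {
        this.maxIterations = n;
        return this;
    }

    MutableFitter withTolerance(double t) {
        this.tolerance = t;
        return this;
    }

    int getMaxIterations() { return maxIterations; }
    double getTolerance() { return tolerance; }
}

class MutableFitterDemo {
    public static void main(String[] args) {
        MutableFitter shared = new MutableFitter();
        MutableFitter chained = shared.withMaxIterations(500).withTolerance(1e-10);
        System.out.println(chained == shared);         // prints true: same object
        System.out.println(shared.getMaxIterations()); // prints 500
    }
}
```

Any code holding a reference to "shared" sees the new settings, which is
exactly what an immutable design is meant to rule out.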

Then you can mitigate that by having a helper class (aka "builder") that
is mutable, will collect all the changes, and then create an immutable
instance of the "interesting" class.
Again, this is a nice trick, but does not get along well with class
hierarchies (cf. a question I asked before in this thread: How to combine
the builders for all the levels of hierarchy?).
One solution is to duplicate all setter "withXxx" methods at all levels
of the hierarchy: huge duplication and plenty of potential copy/paste
bugs.
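
One possible reading of that duplication, sketched with two hypothetical
classes ("BaseOptimizer" and "DampedOptimizer" are made-up names, not
the classes in o.a.c.m.fitting.leastsquares): when "withXxx" returns a
new immutable instance, every subclass has to repeat every inherited
"withXxx" so that the result keeps the subclass type and its own fields.

```java
// Hypothetical two-level hierarchy; names are illustrative assumptions.
class BaseOptimizer {
    private final int maxIterations;

    BaseOptimizer(int maxIterations) {
        this.maxIterations = maxIterations;
    }

    BaseOptimizer withMaxIterations(int n) {
        return new BaseOptimizer(n);
    }

    int getMaxIterations() { return maxIterations; }
}

class DampedOptimizer extends BaseOptimizer {
    private final double damping;

    DampedOptimizer(int maxIterations, double damping) {
        super(maxIterations);
        this.damping = damping;
    }

    // Repeated from the parent: the inherited version would return a plain
    // BaseOptimizer and silently drop "damping".
    @Override
    DampedOptimizer withMaxIterations(int n) {
        return new DampedOptimizer(n, damping);
    }

    DampedOptimizer withDamping(double d) {
        return new DampedOptimizer(getMaxIterations(), d);
    }

    double getDamping() { return damping; }
}

class HierarchyDemo {
    public static void main(String[] args) {
        DampedOptimizer o = new DampedOptimizer(10, 0.5)
            .withMaxIterations(20)
            .withDamping(0.1);
        System.out.println(o.getMaxIterations() + " " + o.getDamping()); // prints 20 0.1
    }
}
```

With one property and one subclass this is tolerable; with N properties
and several levels, each level carries N near-identical overrides.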

Also, it's not clear to me what exactly is the problem with upper levels
in this particular case.
Do we need to forbid further calls to "withXxx" methods?

Also, I've indicated that a fluent API that will create a new instance
at every call to "withXxx" can be completely counter-intuitive (and the
source of "subtle" bugs too).
This is related to the previous point: immutability or unmodifiability?
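
A small sketch of the kind of "subtle" bug meant here, assuming a
hypothetical immutable class "ImmutableFitter" (illustrative name only):
a caller used to the mutable fluent style discards the return value, and
the call has no effect.

```java
// Hypothetical immutable class; not an actual Commons Math type.
final class ImmutableFitter {
    private final int maxIterations;

    ImmutableFitter(int maxIterations) {
        this.maxIterations = maxIterations;
    }

    // Returns a modified copy; "this" is left untouched.
    ImmutableFitter withMaxIterations(int n) {
        return new ImmutableFitter(n);
    }

    int getMaxIterations() { return maxIterations; }
}

class ImmutableFitterDemo {
    public static void main(String[] args) {
        ImmutableFitter f = new ImmutableFitter(100);

        f.withMaxIterations(500);                 // result discarded: no effect on f
        System.out.println(f.getMaxIterations()); // prints 100

        f = f.withMaxIterations(500);             // the reference must be reassigned
        System.out.println(f.getMaxIterations()); // prints 500
    }
}
```

The first call compiles without any warning, so the mistake only shows
up at run time as a setting that was apparently ignored.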

Alois, it seems that "final" is (was?) useful for performance
reasons. [But again, this would probably be noticeable only with
many allocations. Is there such a case for optimizer objects?]

However, designing a class to be immutable is not always cheap. An
example is the combination that is the subject of this thread; here,
"costly" means: a lot of repetitive code lines[1], which could be
easily avoided by dropping immutability, without losing anything
because:
1. the code is _not_ threadsafe anyways (with or without "final"),
2. the code could be made thread-safe, with or without "final".

Yes, but this is only a part (and I think a small part) of the deal. I
really think we should do it, even if it is costly from a development
point of view, even for single threaded applications, just for the
better safety it provides for upper levels. And yes, I know this is not
sufficient to get completely safe code, but it is really a *big* step
forward and helps developers of upper layers to concentrate on other things.

IMHO, code duplication always indicates that _something_ is wrong.
It might not be obvious to discover what; but the right solution quite
often entails that duplication is eventually eliminated. So: better not
to introduce it in the first place.


Hence, I think that when the functionality is well circumscribed,
and it is easy to make everything "final", it's worth it. But when
it's not that easy, we should indeed not insist on it[2] without
1. use-cases that show the need for multi-threaded usage of the
   class, and
2. providing proof that immutability does indeed bring thread-safety.

I don't agree with 2, for the reasons above (we don't use immutability
only for the sake of thread safety).

It would really help me to see how the current design in
 o.a.c.m.fitting.leastsquares
is "dangerous".

Thanks,
Gilles


best regards,
Luc


I still think that CM should care about multi-threading (reasons
detailed in other posts) but we should focus on tasks where the
developers can readily _measure_ the benefits rather than spend (a
lot of) time designing complex schemes which are not required by
common use-cases.
In addition to "stat", a few other CM areas where MT is of direct
application are: FFT, Genetic Algorithms, Machine-learning, ...


Regards,
Gilles

> [...]

[1] Cf. diff (in recent commits of optimizer classes in package
    "o.a.c.m.fitting.leastsquares") between mutable and immutable
    versions of the code.
[2] Because we lose something on the "simplicity" side.



