Hello Luc.
Hi Gilles,
> [...]
[...]
<side rant>Historically, we did not care about thread-safety at all
in [math], assuming the standard use case was *always* going to be
one instance per thread. The statistics aggregators are an example
where multithreaded access makes sense, but this is much more the
exception than the rule in [math]. I would really like to get clear
on which classes really need to be threadsafe themselves rather than
blindly assuming that all do and insisting that everything be
immutable so that we don't have to think about how to make things
threadsafe.</side rant>
We would be better able to advance if we accepted that "threadsafe"
and "immutable" are _not_ interchangeable.
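To illustrate the distinction, here is a minimal sketch of a class that is mutable yet thread-safe. "SynchronizedSum" is a hypothetical name chosen for this example; it is not a Commons Math class, and synchronization is only one of several ways to get thread-safety.

```java
// Hypothetical aggregator (not part of Commons Math): its state changes on
// every call to add(), so it is not immutable, but all access to that state
// goes through synchronized methods, so it can be shared between threads.
final class SynchronizedSum {
    private double sum;  // mutable, guarded by "this"
    private long count;  // mutable, guarded by "this"

    synchronized void add(double value) {
        sum += value;
        ++count;
    }

    synchronized long getCount() {
        return count;
    }

    synchronized double getMean() {
        return count == 0 ? Double.NaN : sum / count;
    }
}

final class SynchronizedSumDemo {
    public static void main(String[] args) throws InterruptedException {
        final SynchronizedSum stat = new SynchronizedSum();
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                stat.add(2.0);
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // Both threads have finished, so all 2000 additions are visible here.
        System.out.println(stat.getCount() + " values, mean = " + stat.getMean());
    }
}
```

Conversely, a class whose fields are all "final" is not automatically thread-safe if one of those fields refers to a mutable object (an array, say) that is shared with other code.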
We have seen (in Commons Math also) that immutability can provide
thread-safety in a cheap way. Immutability is mostly easy to
implement when an object's fields are initialized with data that is
guaranteed to not be held by another object, and whose accessors
will forbid modification.
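A minimal sketch of that "easy" case, assuming a hypothetical class "Weights" (not an actual CM class): the constructor copies its input so that no other object holds the data, and the accessor returns a copy so that callers cannot modify it.

```java
// Hypothetical class, for illustration only.
final class Weights {
    private final double[] values;

    Weights(double[] values) {
        // Defensive copy: the caller's array is not retained.
        this.values = values.clone();
    }

    double get(int index) {
        return values[index];
    }

    double[] getValues() {
        // Defensive copy: the caller cannot reach the internal array.
        return values.clone();
    }
}

final class WeightsDemo {
    public static void main(String[] args) {
        double[] input = {1.0, 2.0};
        Weights w = new Weights(input);
        input[0] = 99.0;          // does not affect "w"
        w.getValues()[1] = -1.0;  // modifies a copy only
        System.out.println(w.get(0) + ", " + w.get(1));
    }
}
```

The price is one array allocation per construction and per call to getValues(), which matters only if such calls are frequent.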
Immutability is not used only for thread safety.
It is an interesting property even for single-threaded applications, and
more so for applications that are built from several layers developed by
different teams (which is obviously the case for [math], as we only
develop the lowest layer).
Without immutability, developers at upper levels not only have to
know the API, they also have to know the gory details of *all*
underlying levels and how they manage their input. Otherwise, they need
to perform copies themselves for defensive programming, and often
forget to do so.
You don't have to convince me that immutability is a nice feature; I try
to have it everywhere I can. :-)
But by actually trying to have it _in the context of the fluent API
paradigm_, I discovered the hard way that it is not always nice (hence
this thread).
Looking around the web for fluent APIs, most designs, usages, and examples
do not provide immutability: the primary purpose of "fluent" is to chain
on-the-fly modifications of an object's properties. [This is the complete
opposite of immutability!]
Then you can mitigate that by having a helper class (aka "builder") that is
mutable, will collect all the changes, and then create an immutable instance
of the "interesting" class.
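For reference, a minimal sketch of that builder arrangement. All names here ("OptimizerConfig", "maxIterations", "tolerance") and the default values are hypothetical; this is not the o.a.c.m.fitting.leastsquares API.

```java
// Hypothetical immutable "interesting" class with a mutable helper.
final class OptimizerConfig {
    private final int maxIterations;
    private final double tolerance;

    private OptimizerConfig(Builder b) {
        this.maxIterations = b.maxIterations;
        this.tolerance = b.tolerance;
    }

    int getMaxIterations() {
        return maxIterations;
    }

    double getTolerance() {
        return tolerance;
    }

    static Builder builder() {
        return new Builder();
    }

    // Mutable: each "withXxx" modifies the builder and returns it for chaining.
    static final class Builder {
        private int maxIterations = 100; // arbitrary default for the example
        private double tolerance = 1e-6; // arbitrary default for the example

        Builder withMaxIterations(int n) {
            this.maxIterations = n;
            return this;
        }

        Builder withTolerance(double t) {
            this.tolerance = t;
            return this;
        }

        // The only place where an immutable instance is created.
        OptimizerConfig build() {
            return new OptimizerConfig(this);
        }
    }
}

final class OptimizerConfigDemo {
    public static void main(String[] args) {
        OptimizerConfig c = OptimizerConfig.builder()
            .withMaxIterations(50)
            .withTolerance(1e-3)
            .build();
        System.out.println(c.getMaxIterations() + " " + c.getTolerance());
    }
}
```

Only one immutable instance is allocated per chain, at the cost of a second class that mirrors the fields of the first.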
Again, this is a nice trick, but does not get along well with class
hierarchies (cf. a question I asked before in this thread: How to combine
the builders for all the levels of hierarchy?).
One solution is to duplicate all setter "withXxx" methods at all levels of
the hierarchy: huge duplication and plenty of potential copy/paste bugs.
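A minimal sketch of where the duplication comes from, using a hypothetical two-level hierarchy ("BaseOptimizer", "GaussNewton", "maxEval" and "useLU" are illustrative names, not the actual CM classes):

```java
// Hypothetical immutable base class with one "withXxx" method.
class BaseOptimizer {
    private final int maxEval;

    BaseOptimizer(int maxEval) {
        this.maxEval = maxEval;
    }

    int getMaxEval() {
        return maxEval;
    }

    BaseOptimizer withMaxEval(int n) {
        return new BaseOptimizer(n);
    }
}

// Hypothetical subclass adding one property of its own.
class GaussNewton extends BaseOptimizer {
    private final boolean useLU;

    GaussNewton(int maxEval, boolean useLU) {
        super(maxEval);
        this.useLU = useLU;
    }

    boolean isUseLU() {
        return useLU;
    }

    // Duplicated from the parent: without this override, withMaxEval()
    // would return a plain BaseOptimizer, "useLU" would be lost, and the
    // chain below would not compile.
    @Override
    GaussNewton withMaxEval(int n) {
        return new GaussNewton(n, useLU);
    }

    GaussNewton withLU(boolean lu) {
        return new GaussNewton(getMaxEval(), lu);
    }
}

final class HierarchyDemo {
    public static void main(String[] args) {
        GaussNewton g = new GaussNewton(10, true).withMaxEval(20).withLU(false);
        System.out.println(g.getMaxEval() + " " + g.isUseLU());
    }
}
```

Every property declared in a parent class needs such an override in every subclass, and each override must remember to carry over all the subclass fields.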
Also, it's not clear to me what exactly is the problem with upper levels
in this particular case.
Do we need to forbid further calls to "withXxx" methods?
Also, I've indicated that a fluent API that will create a new instance
at every call to "withXxx" can be completely counter-intuitive (and the
source of "subtle" bugs too).
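A minimal sketch of the kind of bug I have in mind, with a hypothetical immutable class "Settings" (not a CM class): the call looks like a setter, but its result is silently discarded.

```java
// Hypothetical immutable class whose "withXxx" returns a new instance.
final class Settings {
    private final int maxIterations;

    Settings(int maxIterations) {
        this.maxIterations = maxIterations;
    }

    int getMaxIterations() {
        return maxIterations;
    }

    Settings withMaxIterations(int n) {
        return new Settings(n);
    }
}

final class FluentPitfallDemo {
    public static void main(String[] args) {
        Settings s = new Settings(100);

        // Reads like a modification, but the new instance is thrown away.
        s.withMaxIterations(500);
        System.out.println(s.getMaxIterations()); // prints 100

        // The returned instance must be kept explicitly.
        s = s.withMaxIterations(500);
        System.out.println(s.getMaxIterations()); // prints 500
    }
}
```

The compiler accepts the first call without complaint, so nothing signals the mistake.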
This is related to the previous point: immutability or unmodifiability?
Alois, it seems that "final" is (was?) useful for performance
reasons. [But again, this would probably be noticeable only with
many allocations. Is there such a case for optimizer objects?]
However, designing a class to be immutable is not always cheap. An
example is the combination that is the subject of this thread; here,
"costly" means: a lot of repetitive code lines[1], which could be
easily avoided by dropping immutability, without losing anything
because:
1. the code is _not_ threadsafe anyways (with or without "final"),
2. the code could be made thread-safe, with or without "final".
Yes, but this is only a part (and I think a small part) of the deal. I
really think we should do it, even if it is costly from a development
point of view, even for single-threaded applications, just for the
better safety it provides for upper levels. And yes, I know this is not
sufficient to get completely safe code, but it is really a *big* step
forward and helps developers of upper layers to concentrate on other
things.
IMHO, code duplication always indicates that _something_ is wrong.
It might not be obvious to discover what; but the right solution quite
often entails that duplication is eventually eliminated. So: better not
to introduce it in the first place.
Hence, I think that when the functionality is well circumscribed,
and it is easy to make everything "final", it's worth it. But when
it's not that easy, we should indeed not insist on it[2] without
1. use-cases that show the need for multi-threaded usage of the
class, and
2. providing proof that immutability does indeed bring
thread-safety.
I don't agree with 2, for the reasons above (we don't use immutability
only for the sake of thread safety).
It would really help me to see how the current design in
o.a.c.m.fitting.leastsquares is "dangerous".
Thanks,
Gilles
best regards,
Luc
I still think that CM should care about multi-threading (reasons
detailed in other posts) but we should focus on tasks where the
developers can readily _measure_ the benefits rather than spend (a
lot of) time designing complex schemes which are not required by
common use-cases.
In addition to "stat", a few other CM areas where MT is of direct
application are: FFT, Genetic Algorithms, Machine-learning, ...
Regards,
Gilles
> [...]
[1] Cf. diff (in recent commits of optimizer classes in package
"o.a.c.m.fitting.leastsquares") between mutable and immutable
versions of the code.
[2] Because we lose something on the "simplicity" side.