Hello.


Sorry for the confusion my code sample caused; it made sense in my mind.
:) I was speaking from the perspective that an abstract class is a
skeletal implementation of an interface, created so that implementing
the interface is easier. For the non-linear least squares (NLLS) package I was thinking something like: https://gist.github.com/anonymous/6184665

Thanks for the effort of writing the example in more details.

To address a few concerns I saw,

Gilles wrote:
In this way, it does not; what I inferred from the partial code above was
that there would only be _one_ "create" method. But:
1. All "create" methods must be overridden at the next level of the class
hierarchy (this is the duplication I was referring to in the first
post).

True, but concrete classes should not be extended.

Why not?
[Anyways, the duplication also occurs in the intermediate (abstract) levels.]

Adding another
abstract class to the hierarchy would mean implementing the create
method from the superclass and delegating to a new abstract create
method that includes the new parameters. The abstract class hierarchy
would have to parallel the interface hierarchy.

With a mutable instance, you avoid this; that's the point.
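
To make the contrast concrete, here is a minimal sketch of the two styles. All class and method names are hypothetical (this is neither the gist's code nor CM API). The immutable variant needs a constructor chain plus a "create" method at each level; the mutable variant needs neither.

```java
// Hypothetical names throughout; this is not the Commons Math API.

// Immutable style: "withXxx" goes through "create", which every concrete
// level of the hierarchy must redefine with its full parameter list.
abstract class ImmutableBase<T extends ImmutableBase<T>> {
    private final int maxEval;

    protected ImmutableBase(int maxEval) { this.maxEval = maxEval; }

    public int getMaxEval() { return maxEval; }

    public T withMaxEval(int newMaxEval) { return create(newMaxEval); }

    protected abstract T create(int maxEval);
}

final class ImmutableLeaf extends ImmutableBase<ImmutableLeaf> {
    private final double tolerance;

    ImmutableLeaf(int maxEval, double tolerance) {
        super(maxEval);
        this.tolerance = tolerance;
    }

    public double getTolerance() { return tolerance; }

    public ImmutableLeaf withTolerance(double newTolerance) {
        return new ImmutableLeaf(getMaxEval(), newTolerance);
    }

    @Override
    protected ImmutableLeaf create(int maxEval) {
        return new ImmutableLeaf(maxEval, tolerance);
    }
}

// Mutable style: fluent setters modify the instance and return it;
// no constructor chain and no "create" method are needed.
class MutableBase<T extends MutableBase<T>> {
    private int maxEval;

    public int getMaxEval() { return maxEval; }

    @SuppressWarnings("unchecked")
    public T withMaxEval(int newMaxEval) {
        maxEval = newMaxEval;
        return (T) this;
    }
}

class MutableLeaf extends MutableBase<MutableLeaf> {
    private double tolerance;

    public double getTolerance() { return tolerance; }

    public MutableLeaf withTolerance(double newTolerance) {
        tolerance = newTolerance;
        return this;
    }
}
```

In both styles a caller can write "x.withMaxEval(200).withTolerance(1e-8)"; the difference is whether the result is a new object or the same one.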


2. When a parameter is added somewhere in the hierarchy, all the classes
below must be touched too.

True, but since abstract classes are just skeletal implementations of
interfaces you can't add any methods without breaking compatibility
anyway.

Adding a (concrete) method in an abstract class will not break compatibility.

(You would have to add the same method to the public interface
too.)

There you break compatibility; that's why we removed a lot of the
interfaces in CM, because the API is not stable, and abstract classes
allow non-breaking changes.
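
A small sketch of that difference, using hypothetical types ("AbstractSolver", "ClientSolver" and "solveTwice" are made up for illustration). The concrete method is assumed to have been added to the abstract class after the client subclass was written; the interface counterpart is shown only as a comment because it would not compile against old implementors (default methods, available since Java 8, soften this for interfaces).

```java
// Hypothetical types for illustration only.

// "Version 2" of a library class: "solveTwice" is a concrete method that
// is assumed to have been added after client code was already written.
abstract class AbstractSolver {
    public abstract double solve(double start);

    // Added later; existing subclasses inherit it without any change.
    public double solveTwice(double start) {
        return solve(solve(start));
    }
}

// Client code written against "version 1": it overrides only "solve"
// and still compiles and runs against "version 2".
class ClientSolver extends AbstractSolver {
    @Override
    public double solve(double start) {
        return start / 2;
    }
}

// By contrast, adding an abstract method to an interface breaks every
// existing implementor until it is updated:
//
// interface Solver {
//     double solve(double start);
//     double solveTwice(double start); // new: old implementors no longer compile
// }
```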

This does make it important to decide on a well written and
complete API before releasing it.

When the scope of the software is well circumscribed, that would be
possible. With the whole of [Math]ematics, much less so. :-}
And state-of-the-art in Java is a moving target, aimed at by changing
CM contributors with differing needs and tastes; this adds to the
unstable mix.

And, we must note, that the duplication still does not ensure "real"
immutability unless all the passed arguments are themselves immutable.
But:
1. We do not have control over them; e.g. in the case of the optimizers the "ConvergenceChecker" interface could possibly be implemented by a non-thread-safe class (I gave one example of such a thing a few weeks
ago when a user wanted to track the optimizer's search path).

True. Thread safety is a tricky beast. I think we agree that the only
way to guarantee thread safety is to only depend on final concrete
classes that are thread safe themselves.

I don't think so. When objects are immutable, thread-safety follows (but
note that the current optimizers in CM were never thread-safe).
But thread-safety can exist even with mutable objects; it's just that more
care must be taken to ensure it.
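
As an illustration of the "more care" case, here is a minimal sketch of a mutable object that is nevertheless thread-safe because every access to its state is guarded by the instance lock. "PathTracker" and "PathTrackerDemo" are hypothetical names, loosely modelled on the search-path tracking use-case mentioned above; they are not CM classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: a mutable tracker made thread-safe by guarding
// every access to its state with the instance lock.
class PathTracker {
    private final List<double[]> path = new ArrayList<>();

    // Stores a defensive copy so that callers cannot alter recorded points.
    synchronized void record(double[] point) { path.add(point.clone()); }

    synchronized int size() { return path.size(); }
}

class PathTrackerDemo {
    // Two threads record 1000 points each into one shared tracker.
    static int run() {
        final PathTracker tracker = new PathTracker();
        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) {
                tracker.record(new double[] {i});
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return tracker.size();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 2000
    }
}
```

Without the "synchronized" keyword the same code could lose updates or corrupt the list, which is the kind of bug that tends to show up only under real load.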

This is directly at odds with
the inversion of control/dependency injection paradigm. I think a
reasonable compromise is to depend on interfaces and make sure all the
provided implementations are thread safe.

Yes, that's a way, but again easier said than done.

A simple sequential user won't
need to care about thread safety. A concurrent user will need to
understand the implications of Java threading to begin with. Accurate
documentation of which interfaces and methods are assumed to be thread
safe goes a long way here.

I don't think I'm wrong if I say that most concurrent bugs are found in
production rather than in contrived test cases.
[That's why I advocated for introducing "real" applications as use-cases
for CM.]

2. Some users _complained_ (among other things :-) that we should not
force immutability of some input (model function and Jacobian IIRC)
because in some use-cases, it would be unnecessarily costly.

I agree that copying any large matrices or arrays is prohibitively
expensive. For the NLLS package we would be copying a pointer to a
function that can generate a large matrix. I think adding some
documentation that functions should be thread safe if you want to use
them from multiple threads would be sufficient.

If you pass a "pointer" (i.e. a "reference" in Java), all bets are off: the class is not inherently thread-safe. That's why I suggested mandating a _deep_ "copy" method (with a stringent contract that should allow a caller to be sure that all objects owned by an instance are disconnected from any
other objects).


Consequently, if (when creating a new instance) we assign a reference passed to the fluent method, we can no longer guarantee thread-safety; this in turn raises the question of whether a false sense of security justifies the increased code duplication and complexity (compare your code below with the same but without the constructors and "create" methods:
even a toy example is already cut by more than half).

Agreed that thread safety can only be guaranteed with the help of the user. The immutable+fluent combination does add an additional layer of indirection. On the other hand it is much simpler to debug and analyze,
especially in a concurrent environment.

Immutable objects can be shared between threads.
Thread-safety is equally obtained if mutable objects are not shared between threads, e.g. keep the optimizer confined in one thread (no implementation
in CM features concurrency anyways).
Even if the optimizers are not immutable, it will be possible to benefit from the efficiency improvements brought by concurrency: e.g. if the objective function is thread-safe, several evaluations could be performed in parallel. [Moreover, the optimization logic is not really concurrent in most
optimizers. So why have unnecessarily complicated code?]
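
A minimal sketch of that idea, under the assumption that the objective function is thread-safe. The "ObjectiveFunction" interface and "ParallelEvaluator" class are hypothetical and not part of CM; only the java.util.concurrent calls are standard. The optimizer itself could remain mutable and confined to one thread while it delegates the evaluations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical interface; implementations are assumed to be thread-safe.
interface ObjectiveFunction {
    double value(double[] point);
}

class ParallelEvaluator {
    // Evaluates "f" at all points concurrently; results keep the input order.
    static double[] evaluateAll(final ObjectiveFunction f,
                                List<double[]> points,
                                int nThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Callable<Double>> tasks = new ArrayList<>();
            for (final double[] p : points) {
                tasks.add(() -> f.value(p));
            }
            // "invokeAll" blocks until every task has completed and returns
            // the futures in the same order as the submitted tasks.
            List<Future<Double>> futures = pool.invokeAll(tasks);
            double[] result = new double[futures.size()];
            for (int i = 0; i < result.length; i++) {
                result[i] = futures.get(i).get();
            }
            return result;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```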


Then if we really want to aim at thread-safety, I think that the approach of mandating a "copy()" interface to all participating classes, would be a serious contender to just making all fields "final". Let's also recall
that immutability is not a goal but a means (the goal being
thread-safety).

I still think that the three "tools" mentioned in the subject line do not
play along very well, unfortunately.
I was, and still am, a proponent of immutability but IMO this discussion
indicates that it should not be enforced at all costs. In particular, in
small and non-layered objects, it is easy to ensure thread-safety (through
"final") but the objects that most benefit from a fluent API are big (with
many configuration combinations), and their primary usage does not
necessarily benefit from transparent instantiation.

Given all these considerations, in the first steps for moving some CM codes
to fluent APIs, I would aim for simplicity and _less_ code (if just to be
able to make adjustments more quickly).
Then, from "simple" code, we can more easily experiment (and compare, with
"diff") the merits and drawbacks of various routes towards thread-safety.

OK?

Agreed on the goal of thread safety. Is the copy a shallow copy?

No! (cf. above.)

If it
is, then copy is a complete punt of thread-safety to the user, forcing them to determine when and what they need to copy. I think the mutable+copy option would make it harder to understand client code and
understand where copying is necessary.

No; my suggestion, if feasible, is that a user who needs his own, personal,
unshared instance would simply call "copy()":

public void myMethodOne(Optimizer optim) {
  // I don't trust that "optim" can be safe: I make a (deep) copy.
  final Optimizer myOptim = optim.copy();

  // By contract, "myOptim" is assumed to contain unshared fields.
  // I can pass it safely to another thread.
  myMethodTwo(myOptim);
}

private void myMethodTwo(Optimizer optim) {
  // Create a new thread, whatever...
}
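
What such a deep "copy()" contract might look like on the implementation side can be sketched as follows. The names "DeepCopyable", "SearchHistory" and "SimpleOptimizer" are hypothetical and only illustrate the idea: every mutable object owned by an instance is itself copied, so that the copy shares no state with the original.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes sketching the "deep copy" contract; not CM API.
interface DeepCopyable<T> {
    // Contract: the returned instance shares no mutable state with "this".
    T copy();
}

class SearchHistory implements DeepCopyable<SearchHistory> {
    private final List<double[]> points = new ArrayList<>();

    // Stores a clone, so the caller's array stays disconnected.
    void add(double[] p) { points.add(p.clone()); }

    int size() { return points.size(); }

    @Override
    public SearchHistory copy() {
        SearchHistory h = new SearchHistory();
        for (double[] p : points) {
            h.add(p); // "add" clones each array
        }
        return h;
    }
}

class SimpleOptimizer implements DeepCopyable<SimpleOptimizer> {
    private double[] start = new double[0];
    private SearchHistory history = new SearchHistory();

    // Fluent, mutable setter: clones its argument and returns "this".
    SimpleOptimizer withStart(double[] s) {
        start = s.clone();
        return this;
    }

    double[] getStart() { return start.clone(); }

    SearchHistory getHistory() { return history; }

    @Override
    public SimpleOptimizer copy() {
        SimpleOptimizer o = new SimpleOptimizer();
        o.start = start.clone();    // copy the owned array
        o.history = history.copy(); // recurse into the owned object
        return o;
    }
}
```

A shallow copy would have assigned "history" directly, leaving both optimizers writing into the same list; the recursion into owned objects is what makes the copy safe to hand to another thread.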

But my current point is that this could be developed at a later point,
after the fluent API has been adopted (and more widely tried within CM),
when someone has shown that e.g. the optimizer must be thread-safe to
achieve some actual use-case.

Ultimately the decision is up to
the maintainers and I think both options under discussion are a big
improvement over the current API. :)

Thanks for the great library.

Thanks for the discussion,
Gilles

