Re: [math] RealMatrix.set(double)

Sébastien Brisard Wed, 24 Aug 2011 23:10:38 -0700

Hi,
following Greg's suggestion, here is a first attempt at summarizing
what I understood from the previous discussions regarding
RealVector/RealMatrix interfaces. If we finally drop the matter, as
suggested by Phil, this will be just that: a summary. Otherwise, maybe
this list could be moved to a WIKI page, so that people could freely
edit it, add missing items, and remove all errors that will surely
appear below.
Best regards,
Sébastien


DISCLAIMER: I'm a rather new user of CM, so there are probably many
errors below. I do apologize for that. Also, I am unaware of many
older discussions which help understand why the interfaces are
designed the way they are now. In other words, I am not the best
qualified person to do the job below, but I found some time to do a
first draft... Please do not mistake any of my errors below for a
dismissive assertion; I would not dare to be judgemental.

So here goes

* Interfaces vs. abstract classes
This is missing from the above discussion, but the matter was raised
recently regarding RealVectors. The suggestion was to
  - get rid of interface RealVector
  - rename AbstractRealVector as RealVector.
It was said that this was "scary", but certainly worth it... The same
goes for RealMatrix/AbstractRealMatrix.

* Consistency in naming and make explicit the rationale for choosing
one or another naming scheme
I believe this item refers to methods ebeXXX and mapXXX in RealVector
(see JIRA MATH-643). Whether or not the ebe- and map- prefixes
referred to the same concept was debated.

My understanding is that
  - map means "apply the same univariate function to all elements of
this vector"
  - ebe means "apply the same bivariate function to all elements of
this and the specified vector"

Here, "function" should be understood in a general sense, see for
example RealVector.mapAdd(double), or
RealVector.ebeMultiply(RealVector). From this point of view, some
methods in RealVector are inconsistently named. For example,
RealVector.combine(double, double, RealVector) should probably be
renamed RealVector.ebeCombine(double, double, RealVector).
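
To make the two prefixes concrete, here is a minimal sketch on plain
double[] arrays. The class MapVsEbeSketch and its methods are
hypothetical (they are not part of CM) and rely on the Java 8
functional interfaces; the point is only that combine(a, b, w) fits
the "ebe" pattern, since it computes a * v[i] + b * w[i] entry by entry.

```java
import java.util.function.DoubleBinaryOperator;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper, not part of Commons Math: map- vs. ebe- semantics. */
final class MapVsEbeSketch {

    /** map: apply one univariate function to every entry of v. */
    static double[] map(double[] v, DoubleUnaryOperator f) {
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = f.applyAsDouble(v[i]);
        }
        return out;
    }

    /** ebe: apply one bivariate function to matching entries of v and w. */
    static double[] ebe(double[] v, double[] w, DoubleBinaryOperator f) {
        if (v.length != w.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = f.applyAsDouble(v[i], w[i]);
        }
        return out;
    }

    /** combine(a, b, w) read as an ebe- operation: a * v[i] + b * w[i]. */
    static double[] ebeCombine(double[] v, double a, double b, double[] w) {
        return ebe(v, w, (x, y) -> a * x + b * y);
    }
}
```

With this reading, mapAdd(d) is map with x -> x + d, and
ebeMultiply(w) is ebe with (x, y) -> x * y.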

Arguably RealVector.add, RealVector.subtract should also be named
RealVector.ebeAdd, RealVector.ebeSubtract, but that would probably be
taking this line of reasoning too far.

Note that this naming convention (map- vs. ebe-), which is pretty
self-explanatory, is apparently *not* used in RealMatrix.

* Consistency of operations (and naming) between "RealVector" and "RealMatrix"

Vectors and matrices are two very similar concepts, so same operations
on both types of objects should be named consistently. However, there
are a few inconsistencies. I'll try to start the list, which should
probably be extended by others

** Methods which exist in RealVector, but not in RealMatrix
Here is a list of methods implemented in RealVector, but not in
RealMatrix, although they would also make sense in the latter. Whether
they should be implemented in RealMatrix, removed from RealVector, or
left as-is remains to be decided.
  - RealVector.combine(double, double, RealVector)
  - RealVector.ebeDivide(RealVector)
  - RealVector.ebeMultiply(RealVector)
  - RealVector.isInfinite()
  - RealVector.isNaN()
  - RealVector.iterator()
  - RealVector.map(UnivariateRealFunction)
  - RealVector.mapAddToSelf(double)
  - RealVector.mapDivide(double)
  - RealVector.mapDivideToSelf(double)
  - RealVector.mapMultiplyToSelf(double)
  - RealVector.mapSubtract(double)
  - RealVector.mapSubtractToSelf(double)
  - RealVector.mapToSelf(UnivariateRealFunction)
  - RealVector.set(double)
  - RealVector.sparseIterator()
  - RealVector.toArray()
  - all the various norms

** Methods which exist in RealMatrix, but not in RealVector
Same comments.
  - RealMatrix.addToEntry(int, int, double)
  - RealMatrix.createMatrix(int, int) could be useful as
RealVector.createVector(int)
  - RealMatrix.multiplyEntry(int, int, double)
  - RealMatrix.getSubMatrix(int[], int[]) could be implemented as
RealVector.getSubVector(int[])

** Mapping a Univariate function vs. visiting a matrix
Similar functional concepts are defined in both interfaces
  - RealVector can map a UnivariateRealFunction
  - RealMatrix can be walked by a
RealMatrixChangingVisitor/RealMatrixPreservingVisitor

The two approaches differ, since in RealVector the mapped function has
no access to the index of the current entry. I personally think that
both approaches are equally useful. First, mapping a
UnivariateRealFunction is easy, since quite a lot of them are already
defined in CM. Second, the RealMatrixVisitor approach is more general,
and allows one to carry out almost everything you've ever dreamed of
(including in-place operations) [DISCLAIMER: see Greg's answer
reproduced below]. Maybe *both* interfaces could have *both*
approaches?
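
The difference between the two styles can be sketched on a plain
double[][]. Everything below is hypothetical (EntryVisitor is only
loosely modelled on the idea of a "changing visitor" and is not the CM
interface): a mapped function sees the value only, a visitor also sees
the (row, column) position, and any mapping can be turned into a
visitor that ignores the indices.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch; these are not the Commons Math types. */
final class MapVsVisitSketch {

    /** Index-aware callback: returns the new value of entry (row, column). */
    interface EntryVisitor {
        double visit(int row, int column, double value);
    }

    /** Mapping: the function sees only the value of each entry. */
    static void mapToSelf(double[][] m, DoubleUnaryOperator f) {
        for (double[] row : m) {
            for (int j = 0; j < row.length; j++) {
                row[j] = f.applyAsDouble(row[j]);
            }
        }
    }

    /** Visiting: the callback also sees where the value lives. */
    static void walkInRowOrder(double[][] m, EntryVisitor visitor) {
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                m[i][j] = visitor.visit(i, j, m[i][j]);
            }
        }
    }

    /** Any mapping is a visitor that ignores the indices. */
    static EntryVisitor asVisitor(DoubleUnaryOperator f) {
        return (row, column, value) -> f.applyAsDouble(value);
    }
}
```

For instance, "keep the diagonal, zero everything else" is
(i, j, x) -> i == j ? x : 0.0 as a visitor, but cannot be written as a
function of the value alone.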

Greg argued that the visiting approach was limited. I'm not sure I
understood the whole argument, so, for the sake of completeness, I
take the liberty to quote him directly.
<q>
There is a lot to like in the WalkInOrder* set of methods. However, it
is also very constricting. What if I want to set a whole row with a
Arrays.copyTo() call? Also, the interface is a push interface. Data is
pushed to the delegate. This is very troublesome to me. I might need
random access to the whole storage space. I suppose you could solve
this by stashing a copy of the data in your
RealMatrixPreservingVisitor implementation. That seems clunky and
likely to cause very obtuse coding. If you are changing the data
rapidly, you will need to constantly update your cached matrix data.
</q>

** Redundancies
These are methods which I believe perform the same operations (am I right?)
  - RealVector.getData() and toArray()
... to be completed.

** Signature inconsistencies
To be completed.

** Naming inconsistencies
  - RealVector.mapAdd(double) and RealMatrix.scalarAdd(double)
  - RealVector.mapMultiply(double) and RealMatrix.scalarMultiply(double)
  - RealVector.getNorm and RealMatrix.getNorm are not the same norms.
More generally, both interfaces provide access to many different norms
(with different names, I believe, across interfaces).
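
To illustrate why sharing the name getNorm is confusing, here is a
small sketch. It *assumes*, for the sake of the example only, that the
vector norm is the Euclidean norm and that the matrix norm is a
maximum absolute column sum; which norms the two getNorm methods
actually compute should be checked against the Javadoc. The class
NormSketch is hypothetical.

```java
/** Hypothetical sketch: two different quantities that could both be called "the norm". */
final class NormSketch {

    /** Euclidean (L2) norm of a vector. */
    static double euclideanNorm(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x * x;
        }
        return Math.sqrt(sum);
    }

    /** Maximum absolute column sum of a rectangular matrix (assumed definition). */
    static double maxAbsColumnSum(double[][] m) {
        double max = 0.0;
        int columns = m.length == 0 ? 0 : m[0].length;
        for (int j = 0; j < columns; j++) {
            double sum = 0.0;
            for (double[] row : m) {
                sum += Math.abs(row[j]);
            }
            max = Math.max(max, sum);
        }
        return max;
    }
}
```

Under these assumptions, the vector (3, 4) has norm 5, while the same
entries stored as a 2x1 matrix give 7.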

* Which methods should be added
  - Visiting (with reference to current index) concept in RealVector?
  - Mapping (without reference to current index) concept in RealMatrix?

* Which methods should be removed
Quoting Greg again
<q>
Perhaps methods which are not called internally by commons might be
candidates for excision.
</q>

* What API to adopt to let user "create" entry-changing functions
If I didn't misunderstand the suggestion, I believe that the following
proposed interface
public interface MatrixModifier{
    public void updateData( double[][] memberData );
}
would not work, since not all matrices hold their data in double[][] arrays.
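
A sketch of the problem, and of one possible storage-agnostic
alternative. All names below (ModifierSketch, Matrix, Dense, Sparse,
EntryModifier) are hypothetical and only stand in for the real
classes: the sparse implementation keeps its non-zero entries in a
map, so it has no double[][] to hand over, whereas an entry-level
callback that goes through get/set works with both storage schemes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these types exist in Commons Math. */
final class ModifierSketch {

    /** Entry-level alternative to updateData(double[][]). */
    interface EntryModifier {
        double update(int row, int column, double value);
    }

    /** Minimal storage-agnostic view of a matrix. */
    interface Matrix {
        int rows();
        int columns();
        double get(int row, int column);
        void set(int row, int column, double value);
    }

    /** Dense storage: here a double[][] really exists. */
    static final class Dense implements Matrix {
        private final double[][] data;
        private final int columns;
        Dense(int rows, int columns) {
            this.data = new double[rows][columns];
            this.columns = columns;
        }
        public int rows() { return data.length; }
        public int columns() { return columns; }
        public double get(int i, int j) { return data[i][j]; }
        public void set(int i, int j, double v) { data[i][j] = v; }
    }

    /** Sparse storage: only non-zero entries are kept, so there is no double[][]. */
    static final class Sparse implements Matrix {
        private final int rows;
        private final int columns;
        private final Map<Long, Double> entries = new HashMap<>();
        Sparse(int rows, int columns) {
            this.rows = rows;
            this.columns = columns;
        }
        public int rows() { return rows; }
        public int columns() { return columns; }
        public double get(int i, int j) {
            return entries.getOrDefault(key(i, j), 0.0);
        }
        public void set(int i, int j, double v) {
            if (v == 0.0) {
                entries.remove(key(i, j));
            } else {
                entries.put(key(i, j), v);
            }
        }
        int storedEntries() { return entries.size(); }
        private long key(int i, int j) { return (long) i * columns + j; }
    }

    /** Works for any storage because it only goes through get/set. */
    static void apply(Matrix m, EntryModifier f) {
        for (int i = 0; i < m.rows(); i++) {
            for (int j = 0; j < m.columns(); j++) {
                m.set(i, j, f.update(i, j, m.get(i, j)));
            }
        }
    }
}
```

The price is one call per entry, which is exactly the "push" style
that Greg finds constricting; the sketch only shows that it does not
depend on the storage layout.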

* Thread named "(MATH-608) Remove methods from RealMatrix Interface"
I am not really sure this thread came to a conclusion, but it seems to
me that it is closely related to the present discussion. Would anyone
like to write about this issue?

* One personal thought
I personally have come to dislike the schizophrenia in the RealVector
interface between double[] and RealVector. As double[] is the simplest
representation of a vector, all methods which take a RealVector as an
argument in the RealVector interface are duplicated to also take a
double[] as an argument. While this is very flexible for end-users, it
is a bit of a pain when you want to extend this interface in a
consistent way (and it also makes the classes implementing RealVector
quite cluttered). I'm just wondering what the real benefit is, since
the existing hierarchy allows (at virtually no cost) the creation of a
ArrayRealVector from a double[] without taking first a (costly) deep
copy of the specified double[].
For example, for an end-user, it's not much of a hassle to write
v.add(new ArrayRealVector(w, false))
instead of v.add(w)
w being a double[].
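
The "virtually no cost" argument can be sketched with a stand-in
class. WrapSketch below is hypothetical; it only mimics a constructor
taking a double[] and a boolean copy flag, as in the
new ArrayRealVector(w, false) call above. With the flag set to false
the array is wrapped rather than cloned, so a single add(vector)
overload is enough and no add(double[]) duplicate is needed.

```java
/** Hypothetical stand-in for a vector class with a copy/no-copy constructor flag. */
final class WrapSketch {
    private final double[] data;

    /** When copyArray is false, the array is wrapped, not cloned. */
    WrapSketch(double[] d, boolean copyArray) {
        this.data = copyArray ? d.clone() : d;
    }

    /** Single overload: callers holding a double[] wrap it first. */
    WrapSketch add(WrapSketch other) {
        if (other.data.length != data.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double[] out = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = data[i] + other.data[i];
        }
        // out is a fresh array owned by the result, so no copy is needed.
        return new WrapSketch(out, false);
    }

    double getEntry(int i) {
        return data[i];
    }

    /** True if this vector is a view on d rather than a copy of it. */
    boolean sharesStorageWith(double[] d) {
        return data == d;
    }
}
```

The usual caveat applies: a wrapped array is shared, so later changes
to w are visible through the wrapper.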
