Re: [Math] "iterator" and "sparseIterator" in "RealVector" hierarchy

Gilles Sadowski Mon, 15 Aug 2011 15:02:42 -0700

> > > I'm in favor of moving some methods to the SparseXXX interface, but
> > > got voted down at the time. For application developers (like me),
> > > that can expect in advance if the Vector/Matrix is sparse or not it
> > > isn't a big deal. But I can see how it may cause problems for other
> > > libraries that want to leverage C-M.  And actually, having problems
> > > seeing why it is a big deal in general.  If I'm doing an operation
> > > like outer product, I would still prefer that the iterator skips the
> > > zero entries.
> >
> > I'm wondering whether, depending on the application field, one does not
> > know in advance that one should use sparse implementations
> > ("OpenMapRealVector" and "OpenMapRealMatrix"). If those objects are used
> > consistently throughout the application, the operations should all
> > leverage the sparseness optimizations, thanks to polymorphism.
> >
> 
> Polymorphism is really too limited here to get good results.
> 
> And it really helps for a dense matrix to optimize operations against a
> sparse operand as in dense.plus(sparse).


Applications that manipulate sparse things would know the limitations of
polymorphism and would write
  sparse.plus(dense)

> If all matrices implement isSparse
> and sparseIterator, then the general implementation of plus here can have
> one test and get almost all of the benefits of specially coded methods.

One could consider that this comes at the expense of code clarity and a
slightly decreased efficiency for basic (dense) usage.
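
To make the trade-off concrete, here is a minimal sketch of the "one test" idea from the quoted paragraph. All types and names in it ("Vec", "DenseVec", "MapVec", "sparseIndices") are hypothetical simplifications, not the Commons Math API; it only assumes that every vector can answer "isSparse" and expose an iterator over its possibly non-zero entries.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical, simplified types for illustration; not the Commons Math API.
interface Vec {
    int size();
    double get(int i);
    boolean isSparse();
    // Indices of the entries that may be non-zero.
    Iterator<Integer> sparseIndices();
}

final class DenseVec implements Vec {
    final double[] data;

    DenseVec(double... data) { this.data = data.clone(); }

    public int size() { return data.length; }
    public double get(int i) { return data[i]; }
    public boolean isSparse() { return false; }

    // A dense vector has nothing to skip, so it visits every index.
    public Iterator<Integer> sparseIndices() {
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < data.length; i++) {
            all.add(i);
        }
        return all.iterator();
    }

    // General "plus": a single isSparse() test, no knowledge of concrete operand types.
    DenseVec plus(Vec other) {
        if (other.size() != data.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        DenseVec result = new DenseVec(data); // constructor copies the array
        if (other.isSparse()) {
            // Touch only the entries the operand reports as possibly non-zero.
            Iterator<Integer> it = other.sparseIndices();
            while (it.hasNext()) {
                int i = it.next();
                result.data[i] += other.get(i);
            }
        } else {
            for (int i = 0; i < data.length; i++) {
                result.data[i] += other.get(i);
            }
        }
        return result;
    }
}

final class MapVec implements Vec {
    private final int size;
    private final TreeMap<Integer, Double> entries = new TreeMap<>();

    MapVec(int size) { this.size = size; }

    MapVec set(int i, double value) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        entries.put(i, value);
        return this;
    }

    public int size() { return size; }
    public double get(int i) { return entries.getOrDefault(i, 0.0); }
    public boolean isSparse() { return true; }
    public Iterator<Integer> sparseIndices() { return entries.keySet().iterator(); }
}
```

With these definitions, "new DenseVec(1, 2, 3, 4).plus(new MapVec(4).set(2, 10))" visits a single entry of the sparse operand. The cost alluded to above is also visible: every implementation, dense ones included, has to carry "isSparse" and a sparse iterator.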

>  Those tests are needed since it isn't reasonable to assume that the dense
> implementation knows about all types of matrices if users are allowed to
> implement matrices.

Then, isn't there a problem in CM anyways because of those "instanceof"
operators, in "AbstractRealVector" and others? A superclass testing the type
of its subclasses does not feel right...
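
The concern can be illustrated with a small sketch. The names below ("SparseMarker", "AbstractVec", and the three subclasses) are hypothetical and this is not the Commons Math source; the sketch only shows the general pattern of a base class choosing a code path by testing the type of its operand with "instanceof".

```java
// Hypothetical names throughout; this is not the Commons Math source.
interface SparseMarker { }

abstract class AbstractVec {
    abstract int size();
    abstract double get(int i);

    // Reports which code path a binary operation would take for the operand.
    String pathFor(AbstractVec operand) {
        // The base class inspects the concrete type of its own subclasses.
        if (operand instanceof SparseMarker) {
            return "sparse";
        }
        return "dense";
    }
}

final class PlainVec extends AbstractVec {
    private final double[] data;
    PlainVec(double... data) { this.data = data.clone(); }
    @Override int size() { return data.length; }
    @Override double get(int i) { return data[i]; }
}

// A sparse type that the library knows about: it carries the marker.
final class MarkedSparseVec extends AbstractVec implements SparseMarker {
    private final int size;
    MarkedSparseVec(int size) { this.size = size; }
    @Override int size() { return size; }
    @Override double get(int i) { return 0.0; } // all-zero vector, for brevity
}

// A user-defined sparse type that does not carry the marker.
final class UserSparseVec extends AbstractVec {
    private final int size;
    UserSparseVec(int size) { this.size = size; }
    @Override int size() { return size; }
    @Override double get(int i) { return 0.0; } // all-zero vector, for brevity
}
```

In this sketch, "new PlainVec(1, 2).pathFor(new UserSparseVec(2))" takes the dense path even though the operand is sparse, because the type test only recognizes the types that the base class was written against.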

> Moreover, a simple single inheritance type structure is
> not rich enough to express the different kinds and conditions of a matrix.

So why not multiple hierarchies instead of "one size fits all"?

> I would say that the argument should go the other way.  Do you have a unit
> test demonstrating that having these methods in the general case *hurts*
> performance?  It obviously helps.
> 

https://issues.apache.org/jira/browse/MATH-628

> 
> > Could there be specific unit tests demonstrating the necessity of having
> > something like "Iterator<Entry> sparseIterator()" in "RealVector"? The
> > drawback is obviously that in dense implementations, this method must be
> > implemented but is used only in those cases where the object should have
> > been "sparse" in the first place.
> 
> 
> This "drawback" is trivial.  In the abstract implementation, you can just
> delegate to the dense iterator.  Thus, RealVector never needs to know.

In "AbstractRealVector", there is an implementation of "SparseEntryIterator"
but the Javadoc states:
 "[...] Concrete subclasses which are SparseVector implementations should
  make their own sparse iterator, not use this one. [...]"

That does not feel right either.
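
For reference, the delegation described in the quoted paragraph could look roughly like the sketch below: the abstract base supplies a default sparse iteration that simply walks every index, and a sparse subclass overrides it. The class and method names ("BaseVec", "ArrayBackedVec", "MapBackedVec", "sparseIndices", "countSparseVisits") are hypothetical, and the iterators yield plain indices rather than CM's "Entry" objects to keep the example short.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Hypothetical, simplified hierarchy; not the Commons Math classes.
abstract class BaseVec {
    abstract int size();
    abstract double get(int i);

    // Dense iteration: every index from 0 to size() - 1.
    Iterator<Integer> indices() {
        return new Iterator<Integer>() {
            private int cursor = 0;

            @Override public boolean hasNext() { return cursor < size(); }

            @Override public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return cursor++;
            }
        };
    }

    // Default: delegate to the dense iterator. Sparse subclasses override this.
    Iterator<Integer> sparseIndices() { return indices(); }

    // Helper for the example: how many entries does the sparse iteration visit?
    int countSparseVisits() {
        int n = 0;
        for (Iterator<Integer> it = sparseIndices(); it.hasNext(); it.next()) {
            n++;
        }
        return n;
    }
}

final class ArrayBackedVec extends BaseVec {
    private final double[] data;
    ArrayBackedVec(double... data) { this.data = data.clone(); }
    @Override int size() { return data.length; }
    @Override double get(int i) { return data[i]; }
    // No sparseIndices() here: the inherited default is used.
}

final class MapBackedVec extends BaseVec {
    private final int size;
    private final TreeMap<Integer, Double> entries = new TreeMap<>();

    MapBackedVec(int size) { this.size = size; }

    MapBackedVec set(int i, double value) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        entries.put(i, value);
        return this;
    }

    @Override int size() { return size; }
    @Override double get(int i) { return entries.getOrDefault(i, 0.0); }

    // Only the stored entries are visited.
    @Override Iterator<Integer> sparseIndices() { return entries.keySet().iterator(); }
}
```

Here a three-element "ArrayBackedVec" visits three entries through the inherited default, while a "MapBackedVec" of size 1000 with two stored entries visits two. The sketch sidesteps the point raised above: it does not attempt a generic zero-skipping iterator in the base class, which is the part the Javadoc asks sparse subclasses not to reuse.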

> 
> > Unless I'm mistaken, it looks as if it
> > would be sufficient to have "converters" between sparse and dense
> > implementations.
> >
> 
> I think you are mistaken.  Converters are a royal pain.
> 
> Taking a case near to my heart, it is really, really important for gradient
> descent implementations of logistic regression to do smart things with
> sparse feature vectors.  On the other hand, they also need to handle dense
> feature vectors.  The internal coefficient matrix is, however, dense and
> there are half a dozen or more built-in kinds of sparse matrices.  There are
> even several implementations of dense matrices.
> 
> Almost all of the operations are actually on the internal coefficient
> matrix.  If the basic operations on a dense matrix all handle sparse cases
> correctly, then the learning code can declare the feature vector to be a
> Vector.  If not, I have to duplicate (at least) all of the learning simply
> for the benefit of the compiler.

Is this something you do with CM?
If not, could it be done with CM's current implementations of matrices
and vectors?


Regards,
Gilles

