> > > I'm in favor of moving some methods to the SparseXXX interface, but > > > got voted down at the time. For application developers (like me), > > > that can expect in advance if the Vector/Matrix is sparse or not it > > > isn't a big deal. But I can see how it may cause problems for other > > > libraries that want to leverage C-M. And actually, having problems > > > seeing why it is a big deal in general. If I'm doing an operation > > > like outer product, I would still prefer that the iterator skips the > > > zero entries. > > > > I'm wondering whether, depending on the application field, one does not > > know > > in advance that one should use sparse implementations ("OpenMapRealVector" > > and "OpenMapRealMatrix"). If those objects are used consistently throughout > > the application, the operations should all leverage the sparseness > > optimizations, thanks to polymorphism. > > > > Polymorphism is really too limited here to get good results. > > And it really helps for a dense matrix to optimize operations against a > sparse operand as in dense.plus(sparse).
Applications that manipulate sparse things would know the limitations of polymorphism and would write sparse.plus(dense) > If all matrices implement isSparse > and sparseIterator, then the general implementation of plus here can have > one test and get almost all of the benefits of specially coded methods. One could consider that this comes at the expense of code clarity and a slightly decreased efficiency for basic (dense) usage. > Those tests are needed since it isn't reasonable to assume that the dense > implementation knows about all types of matrices if users are allowed to > implement matrices. Then, isn't there a problem in CM anyways because of those "instanceof" operators, in "AbstractRealVector" and others? A superclass testing the type of its subclasses does not feel right... > Moreover, a simple single inheritance type structure is > not rich enough to express the different kinds and conditions of a matrix. So why not multiple hierarchies instead of "one size fits all"? > I would say that the argument should go the other way. Do you have a unit > test demonstrating that having these methods in the general case *hurts* > performance? It obviously helps. > https://issues.apache.org/jira/browse/MATH-628 > > > Could there be specific unit tests demonstrating the necessity of having > > something like "Iterator<Entry> sparseIterator()" in "RealVector"? The > > drawback is obviously that in dense implementations, this method must be > > implemented but is used only in those cases where the object should have > > been "sparse" in the first place. > > > This "drawback" is trivial. In the abstract implementation, you can just > delegate to the dense iterator. Thus, RealVector never needs to know. In "AbstractRealVector", there is an implementation of "SparseEntryIterator" but the Javadoc states: "[...] Concrete subclasses which are SparseVector implementations should make their own sparse iterator, not use this one. [...]" That does not feel right either. > > > Unless I'm mistaken, it looks as if it > > would sufficient to have "converters" between sparse and dense > > implementations. > > > > I think you are mistaken. Converters are a royal pain. > > Taking a case near to my heart, it is really, really important for gradient > descent implementations of logistic regression to do smart things with > sparse feature vectors. On the other hand, they also need to handle dense > feature vectors. The internal coefficient matrix is, however, dense and > there are half a dozen or more built-in kinds of sparse matrices. There are > even several implementations of dense matrices. > > Almost all of the operations are actually on the internal coefficient > matrix. If the basic operations on a dense matrix all handle sparse cases > correctly, then the learning code can declare the feature vector to be a > Vector. If not, I have to duplicate (at least) all of the learning simply > for the benefit of the compiler. Is this something you do with CM? If not, could it be done with the CM's curent implementations of matrices and vectors? Regards, Gilles --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org