On 26/08/2011 07:45, Phil Steitz wrote:
On 8/25/11 8:11 AM, Ted Dunning wrote:
> 2011/8/25 Sébastien Brisard<sebastien.bris...@m4x.org>
>
>> Hi Ted,
>>
>>> You missed my suggestion.
>>>
>> are you referring to the thread named "New method: "addToEntry" in
>> "RealVector""?
>> If yes, I initially thought that what you suggested was pretty similar
>> to the Visitor approach.
>> Thanks to the above message, I now understand that your suggestion is
>> the Visitor approach PLUS the ability to provide different "views" of
>> the same matrix/vector.
>> Is that correct?
>
> Yes. Views are critical to the suggestion because they control where the
> visitor goes.
>
> Allowing more general visitors than a pure element function is probably a
> great idea.  It is not uncommon to need the indexes of the element.
>
>
>> Should I dive into Mahout to understand more on this topic?
>>
> Your choice.
>
> Here is some (very) recent code that makes use of these capabilities:
>
> https://github.com/tdunning/mahout/blob/new-stochastic/math/src/main/java/org/apache/mahout/math/CholeskyDecomposition.java
>
> https://github.com/tdunning/mahout/blob/new-stochastic/math/src/main/java/org/apache/mahout/math/ssvd/SequentialBigSvd.java
>

Thanks, Ted.  That does look very flexible and approachable too.  I
am sorry to flip-flop on this issue, but I am now thinking it might
actually be better to replace the visitor setup that we have with
something like the above, partly due to Greg's comments as well on
the limitations of the current code.  I encourage others to have a
look at the Mahout code and consider the pros and cons of
refactoring.  I don't think the visitor machinery is really used
internally, so refactoring would not be cataclysmic.  Now is the
time to do it if we want to go to a model based more on views and
the functional approach.
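
To make the "views plus visitor" idea above concrete, here is a minimal
sketch of how a view can control where an index-aware visitor goes. All
names here (MatrixView, IndexedVisitor, viewPart, assign) are hypothetical
and chosen for illustration only; this is not the Mahout API and not the
existing Commons Math visitor API, and it assumes a plain dense double[][]
as backing storage.

```java
import java.util.Arrays;

public class ViewVisitorSketch {

    /** Hypothetical visitor: receives view-relative indexes and the current value. */
    interface IndexedVisitor {
        double visit(int row, int col, double value);
    }

    /** Hypothetical view: a rectangular window onto shared storage (no copy). */
    static final class MatrixView {
        private final double[][] data;
        private final int row0;
        private final int col0;
        private final int rows;
        private final int cols;

        private MatrixView(double[][] data, int row0, int col0, int rows, int cols) {
            this.data = data;
            this.row0 = row0;
            this.col0 = col0;
            this.rows = rows;
            this.cols = cols;
        }

        /** View covering the whole (rectangular, non-empty) array. */
        static MatrixView of(double[][] data) {
            return new MatrixView(data, 0, 0, data.length, data[0].length);
        }

        /** Sub-view; offsets are relative to this view. */
        MatrixView viewPart(int rowOffset, int colOffset, int nRows, int nCols) {
            if (rowOffset < 0 || colOffset < 0 || nRows < 0 || nCols < 0
                    || rowOffset + nRows > rows || colOffset + nCols > cols) {
                throw new IndexOutOfBoundsException("sub-view exceeds parent view");
            }
            return new MatrixView(data, row0 + rowOffset, col0 + colOffset, nRows, nCols);
        }

        /** Visits only the entries inside this view and stores the returned values. */
        MatrixView assign(IndexedVisitor visitor) {
            for (int i = 0; i < rows; i++) {
                for (int j = 0; j < cols; j++) {
                    double old = data[row0 + i][col0 + j];
                    data[row0 + i][col0 + j] = visitor.visit(i, j, old);
                }
            }
            return this;
        }
    }

    public static void main(String[] args) {
        double[][] a = new double[3][3];
        // Only the lower-right 2x2 block is visited; the rest of a is untouched.
        MatrixView.of(a).viewPart(1, 1, 2, 2).assign((i, j, x) -> 10.0 * i + j + 1);
        System.out.println(Arrays.deepToString(a));
    }
}
```

The visitor itself stays a small function of (row, col, value); which
entries it touches is decided entirely by the view it is applied to.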

I like the view approach, but wonder how it scales ... down for small
data. If you remember Yannick's concerns, the problems he addresses
(and the ones I address too) involve millions of computations on tiny
matrices and vectors (3x3, 6x6) rather than a few operations on very
large data sets (say, decomposing a 50000x50000 matrix). I would like
Apache Commons Math to address both cases. For now, I think we are
quite bad on large systems, and especially on sparse systems. So we
need to improve there, while still being good for small systems.

Could we basically copy the Mahout code? Ted, what would you think
about this?

Luc


Phil

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org


