On 10/15/11 5:41 AM, Gilles Sadowski wrote:
> Hi.
>
>> first of all, I was the author of this very useful statement on
>> factories... Very constructive indeed.
> Liking something or not is an impression that could well be justified
> afterwards. It also pushes to look for arguments that ascertain the
> feeling. ;-) 
>
>>> However it also shows that the improvement is only ~13% instead of the ~30%
>>> reported by the benchmark in the paper...
>>>
>> could it be that their "naive" implementation as a 2D array is very
>> naive indeed? I notice in the listings provided in the paper that they
>> constantly refer to a[i][j]. I think the strength of having a row
>> representation is to define a temporary variable ai = a[i], and access
>> to a[i][j] as ai[j]. That's what is done in CM anyway, maybe that
>> explains why the gain is not so big in the end.
> You are right; the "naïve" code repeatedly accesses a[i][j].
>
> But this alone doesn't make up for the difference (cf. table below).
>
> operate (calls per timed block: 10000, timed blocks: 100, time unit: ms)
>            name      time/call      std error total time      ratio      difference
>    Commons Math 1.19770542e-01 2.85011660e-04 1.1977e+05 1.0000e+00  0.00000000e+00
> OpenGamma naive 1.23798907e-01 4.01495625e-04 1.2380e+05 1.0336e+00  4.02836495e+03
>    OpenGamma 1D 1.04352827e-01 2.08970600e-04 1.0435e+05 8.7127e-01 -1.54177153e+04
>    OpenGamma 2D 1.12666770e-01 3.50012912e-04 1.1267e+05 9.4069e-01 -7.10377213e+03
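
To make the a[i][j] versus ai[j] point concrete, here is a minimal sketch of a matrix-vector product ("operate") written three ways. The class and method names are made up for illustration and are not taken from Commons Math or OpenGamma, and the flat variant assumes that "1D" in the table means row-major storage in a single array.

```java
/** Three ways to compute A * v (hypothetical class, for illustration only). */
final class OperateSketch {

    /** 2D storage; a[i] is dereferenced again on every inner iteration. */
    static double[] operateNaive(double[][] a, double[] v) {
        final double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            double sum = 0;
            for (int j = 0; j < v.length; j++) {
                sum += a[i][j] * v[j];
            }
            out[i] = sum;
        }
        return out;
    }

    /** 2D storage; the row reference is hoisted out of the inner loop. */
    static double[] operateRowCached(double[][] a, double[] v) {
        final double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            final double[] ai = a[i]; // temporary row reference
            double sum = 0;
            for (int j = 0; j < v.length; j++) {
                sum += ai[j] * v[j];
            }
            out[i] = sum;
        }
        return out;
    }

    /** Assumed 1D layout: element (i, j) lives at data[i * cols + j]. */
    static double[] operateFlat(double[] data, int rows, int cols, double[] v) {
        final double[] out = new double[rows];
        for (int i = 0; i < rows; i++) {
            final int offset = i * cols; // start of row i in the flat array
            double sum = 0;
            for (int j = 0; j < cols; j++) {
                sum += data[offset + j] * v[j];
            }
            out[i] = sum;
        }
        return out;
    }
}
```

All three return the same result; only the memory access pattern differs, which is what a benchmark like the one above would be measuring.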
>
>
>>> I don't think that CM development should be focused on performance
>>> improvements that are so sensitive to the actual hardware (if it's indeed
>>> the varying amount of CPU cache that is responsible for this discrepancy).
>>>
>> That would apparently require fine tuning indeed, just like BLAS
>> itself, which has (I believe) specific implementations for specific
>> architectures. So it goes a bit against the philosophy of Java. I
>> wonder how a JNI interface to BLAS would perform? That would leave
>> the architecture-specific issues out of the Java code (which could
>> even provide a basic implementation of basic linear algebra operations
>> if people do not want to use native code).
> The author of the paper proposes to indeed clone the BLAS tuning
> methodology.
> However, I don't think that this should be a priority for CM (as a
> general-purpose math toolbox).
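
The "native code with a pure-Java fallback" idea could look roughly like the sketch below. Everything here is hypothetical: the interface, the class names, the native method and the library name "hypothetical_blas_glue" are invented for illustration, and the C glue that would forward to BLAS is not shown. Nothing like this exists in Commons Math.

```java
/** Hypothetical backend abstraction for one basic operation (dot product). */
interface DotBackend {
    double dot(double[] x, double[] y);
}

/** Basic implementation for users who do not want native code. */
final class PureJavaBackend implements DotBackend {
    @Override
    public double dot(double[] x, double[] y) {
        double sum = 0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }
}

/** Would delegate to BLAS through JNI; the glue library is an assumption. */
final class NativeBlasBackend implements DotBackend {
    // Assumed to be implemented in C glue code forwarding to a BLAS dot routine.
    private static native double nativeDot(int n, double[] x, double[] y);

    NativeBlasBackend() {
        // Throws UnsatisfiedLinkError when the (hypothetical) library is absent.
        System.loadLibrary("hypothetical_blas_glue");
    }

    @Override
    public double dot(double[] x, double[] y) {
        return nativeDot(x.length, x, y);
    }
}

/** Picks the native backend when it loads, otherwise the pure-Java one. */
final class Backends {
    static DotBackend select() {
        try {
            return new NativeBlasBackend();
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return new PureJavaBackend();
        }
    }
}
```

With this shape the architecture-specific tuning stays behind the native library, and callers only see DotBackend. Whether the JNI call overhead pays off for small arrays would have to be measured.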
>
>>> If there are (human) resources inclined to rewrite CM algorithms in order to
>>> boost performance, I'd suggest to also explore the multi-threading route, as
>>> I feel that the type of optimizations described in this paper are more in the
>>> realm of the JVM itself.
>>>
>> I would be very interested, but know nothing about multi-threading. I
>> will need to explore multi-threading for work anyway, so maybe in the
>> future?
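
As a rough illustration of what the multi-threading route might mean for a matrix-vector product, here is a minimal sketch that spreads the rows over worker threads. The class name is hypothetical and this is not Commons Math code; it uses java.util.stream, so it assumes Java 8 or later.

```java
import java.util.stream.IntStream;

/** Hypothetical helper showing a row-parallel A * v; not part of Commons Math. */
final class ParallelOperateSketch {

    /** Computes A * v, letting the common fork/join pool process rows in parallel. */
    static double[] operate(double[][] a, double[] v) {
        final double[] out = new double[a.length];
        IntStream.range(0, a.length).parallel().forEach(i -> {
            final double[] ai = a[i];
            double sum = 0;
            for (int j = 0; j < v.length; j++) {
                sum += ai[j] * v[j];
            }
            out[i] = sum; // each index is written by exactly one task
        });
        return out;
    }
}
```

Rows are independent, so no locking is needed. For small matrices the scheduling overhead can easily outweigh the gain, so any speed-up would need to be benchmarked.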

Any references to specific optimizations or algorithm improvements here?
> Yes, 3.1, 3.2, ... , 4.0, ... whatever.
>
>> In the meantime, may I bring to your attention the JTransforms
>> library? (http://sites.google.com/site/piotrwendykier/Home)
>> It's a multi-threaded library for various FFT calculations. I've used
>> it a lot, and have been involved in the correction of some bugs. I've
>> never benchmarked it against CM, but the site claims (if my memory
>> does not fail me) greater performance.
> Yes, I did not perform benchmarks; however, Luc already pointed out that he
> had not paid particular attention to the speed efficiency of the code in CM.

I don't think Luc meant to make a broad general statement there.
IIRC, he was talking about one matrix representation class. Let's
focus on specific problems and solutions.

Phil
> Also, there are other problems, cf. issue
>   https://issues.apache.org/jira/browse/MATH-67
>
>> Also it can handle
>> non-power-of-two array dimensions. Plus, the author seems to no
>> longer have time to spend on this library, and may be willing to share it
>> with CM. That would be a first step in the multi-threading realm.
> Unfortunately, no; he doesn't want to donate his code.
>
>> Beware, though; the basic code is a direct translation of C code, and
>> is sometimes difficult to read (thousands of lines, with loads of
>> branching: code coverage analysis was simply a nightmare!).
> So, the above information is only half bad news! ;-)
>
>
> Best,
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>

