I already use Breeze; in fact, my current implementation of sqDist uses it:
https://github.com/danielkorzekwa/bayes-scala-gp/blob/master/src/main/scala/dk/gp/math/sqDist.scala

It is still 3 times slower than sq_dist from gpml.

Thanks for the BID Data Project info.

On 9 September 2015 at 18:45, Dmitriy Lyubimov <[email protected]> wrote:

> Hi Daniel,
>
> you mean, for dense algebra single-threaded java vs. cache, multithreaded,
> SSE4-optimized Intel MKL? I am actually surprised it is not at least 10x.
>
> Mahout focuses on ease of distributed implementations (i.e. dsq_dist
> variant of the routine) but has been somewhat lazy on marrying mahout-math
> with hardware-optimized in-core libraries. That much is true.
>
> The things that somewhat downplayed priority of in-core cpu-bound algebra
> optimizations were:
>
> (1) in distributed operations, multithreading plays a significantly smaller
> role (well-behaved tasks should assume they are allocated only 1 core and
> rely on resource manager to allocate cpu resources)
> (2) for distributed algorithms, unless they are naive power-law ports of
> in-core algorithms, I/O and data serialization expenses start to play a
> significant role in overall algorithm performance compared to shared-memory
> single-machine algorithms.
> (3) a lot of algorithms require non-blas kernel operators anyway
> (4) most importantly, standard BLAS is somewhat unsatisfactory in the
> sparse algebra department, I would seek a better solution than just BLAS
> API. There are some emerging technologies that are sparse/dense balanced
> libraries, but the jury is still out as to what best pathway here is. Or
> maybe, the best path is to do what Theano and BidMat did, i.e. developing
> new set of algebraic kernel routines, but that's probably too heavy for
> this project at the moment.
>
> If you need a good cpu-bound shared-memory environment for dense algebra,
> i'd suggest to try either Breeze or BidMat. Perhaps even the latter as it
> does support sparse subroutines, somewhat anyway, and also has GPU-enabled
> set of matrix implementations.
>
> On Wed, Sep 9, 2015 at 12:21 AM, Daniel Korzekwa <[email protected]>
> wrote:
>
> > Hello,
> >
> > I'm comparing the efficiency of sq_dist() from mahout to sq_dist() from
> > gpml library that is based on bsxfun in octave/matlab.
> >
> > It seems that computing the distance matrix in octave is 5 times faster
> > than in Mahout. Why is that? Can we make it faster?
> >
> > Octave:
> > x = [1:4000]
> > sq_dist(x)
> >
> > Scala (Mahout):
> > val x = Array.range(1,4000,1).map(i => i.toDouble)
> > val A = new DenseMatrix(Array(x)).transpose()
> > val dM = sqDist(A)
> >
> > --
> > Daniel Korzekwa
> > Machine Learning Engineer
> > https://www.linkedin.com/in/danielkorzekwa <http://danmachine.com/>
> >
>

--
Daniel Korzekwa
Machine Learning Engineer
https://www.linkedin.com/in/danielkorzekwa <http://danmachine.com/>
