*Re: performance measurement framework*
We (Databricks) used to use spark-perf
<https://github.com/databricks/spark-perf>, but that was mainly for the
RDD-based API.  We've now switched to spark-sql-perf
<https://github.com/databricks/spark-sql-perf>, which does include some ML
benchmarks despite the project name.  I'll see about updating the project
README to document how to run MLlib tests.
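For anyone who wants to try it before the README is updated, the general
shape should be something like the following (a sketch only; I'm going from
memory of the repo layout, so treat the entry point and parameter name as
placeholders and check the source for the real API):

    // In spark-shell with the spark-sql-perf assembly on the classpath:
    import com.databricks.spark.sql.perf.mllib.MLLib

    // Run the ML benchmarks described in a YAML config (path illustrative).
    MLLib.run(yamlConfig = "/path/to/mllib-benchmarks.yaml")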


On Tue, Jan 24, 2017 at 6:02 PM, bradc <brad.carl...@oracle.com> wrote:

> I believe one of the higher-level goals of Spark MLlib should be to
> improve the efficiency of the ML algorithms that already exist. ML
> currently has reasonable coverage of the important core algorithms. The
> work to reach feature parity for the DataFrame-based API and model
> persistence is also important.
>
> Apache Spark needs to use higher-level BLAS3 and LAPACK routines instead
> of BLAS1 & BLAS2. For a long time we've used the concept of compute
> intensity (compute_intensity = FP_operations/Word) to evaluate the
> performance of the underlying compute kernels (see the papers referenced
> below). Many implementations have shown that better performance, better
> scalability, and a large reduction in memory pressure can be achieved by
> using higher-level BLAS3 or LAPACK routines in both single-node and
> distributed computations.
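>
> To make the contrast concrete (a worked example of mine, not from the
> papers): for square problems of size n, a BLAS1 AXPY does 2n flops over
> ~3n words, a BLAS2 GEMV does 2n^2 flops over ~n^2 words, while a BLAS3
> GEMM does 2n^3 flops over ~4n^2 words, so only BLAS3 compute intensity
> grows with problem size:
>
>    // Back-of-the-envelope compute intensity (flops per word moved)
>    // for representative BLAS routines on square problems of size n.
>    def intensity(flops: Double, words: Double): Double = flops / words
>
>    val n = 1000.0
>    val axpy = intensity(2 * n, 3 * n)              // BLAS1: ~0.67, constant
>    val gemv = intensity(2 * n * n, n * n + 2 * n)  // BLAS2: ~2, constant
>    val gemm = intensity(2 * n * n * n, 4 * n * n)  // BLAS3: n/2 = 500, grows with n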
>
> I performed a survey of some of Apache Spark's ML algorithms.
> Unfortunately, most of them are implemented with BLAS1 or BLAS2 routines,
> which have very low compute intensity. BLAS1 and BLAS2 routines require
> far more memory bandwidth per flop and will not achieve peak performance
> on x86, GPUs, or any other processor.
>
> Apache Spark 2.1.0 ML routines & BLAS Routines
>
> ALS (Alternating Least Squares) matrix factorization
>
>    - BLAS2: _SPR, _TPSV
>    - BLAS1: _AXPY, _DOT, _SCAL, _NRM2
>
> Logistic regression classification
>
>    - BLAS2: _GEMV
>    - BLAS1: _DOT, _SCAL
>
> Generalized linear regression
>
>    - BLAS1: _DOT
>
> Gradient-boosted tree regression
>
>    - BLAS1: _DOT
>
> GraphX SVD++
>
>    - BLAS1: _AXPY, _DOT, _SCAL
>
> Neural Net Multi-layer Perceptron
>
>    - BLAS3: _GEMM
>    - BLAS2: _GEMV
>
> Only the Neural Net Multi-layer Perceptron uses the BLAS3 matrix multiply
> (DGEMM). BTW, the underscores are replaced by S, D, C, or Z for 32-bit
> real, 64-bit real, 32-bit complex, and 64-bit complex operations,
> respectively.
>
> Refactoring the algorithms to use BLAS3 or higher-level LAPACK routines
> will require code changes to use sub-block algorithms, but the performance
> benefits can be great; see the sketch below.
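>
> As a minimal illustration of that kind of refactoring (my sketch, with
> made-up sizes, using the com.github.fommil.netlib BLAS interface that
> Spark 2.x already depends on): multiplying one matrix by k vectors can be
> done as k _GEMV calls, or batched into a single _GEMM over a block of
> vectors, which reuses each element of A across all k vectors:
>
>    import com.github.fommil.netlib.BLAS.{getInstance => blas}
>
>    val (m, n, k) = (512, 512, 64)
>    val a = Array.fill(m * n)(math.random)   // m x n matrix, column-major
>    val x = Array.fill(n * k)(math.random)   // k input vectors, packed
>    val y = new Array[Double](m * k)         // k output vectors
>
>    // BLAS2 version: one _GEMV per vector; A is streamed from memory k times.
>    val yj = new Array[Double](m)
>    for (j <- 0 until k) {
>      blas.dgemv("N", m, n, 1.0, a, m, x.slice(j * n, (j + 1) * n), 1, 0.0, yj, 1)
>      System.arraycopy(yj, 0, y, j * m, m)
>    }
>
>    // BLAS3 version: one _GEMM over the whole block; A is read once and
>    // reused across all k vectors, raising the compute intensity.
>    blas.dgemm("N", "N", m, k, n, 1.0, a, m, x, n, 0.0, y, m)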
>
> More at:
> https://blogs.oracle.com/BestPerf/entry/improving_algorithms_in_spark_ml
> Background:
>
> Brad Carlile. Parallelism, Compute Intensity, and Data Vectorization.
> SuperComputing '93, November 1993.
> <https://blogs.oracle.com/BestPerf/resource/Carlile-app_compute-intensity-1993.pdf>
>
> John McCalpin. Memory Bandwidth and Machine Balance in Current High
> Performance Computers. IEEE TCCA Newsletter, December 1995.
> <https://www.researchgate.net/publication/213876927_Memory_Bandwidth_and_Machine_Balance_in_Current_High_Performance_Computers>
>


-- 

Joseph Bradley

Software Engineer - Machine Learning

Databricks, Inc.

http://databricks.com/
