From: "Ulanov, Alexander"
To: Kazuaki Ishizaki/Japan/IBM@IBMJP, "dev@spark.apache.org", Joseph Bradley
Cc: John Canny, "Evan R. Sparks", Xiangrui Meng, Sam Halliday
Date: 2016/01/22 04:20
Subject: RE: Using CUDA within Spark / boosting linear algebra

Hi Kazuaki,
To: …, "Ulanov, Alexander", "Joseph Bradley", "John Canny", "Evan R. Sparks", "Xiangrui Meng", "Sam Halliday"
Date: 2016/01/21 21:05
Subject: RE: Using CUDA within Spark / boosting linear algebra

Hi Kazuaki,

… Jcuda … performance.

Best regards, Alexander
From: Kazuaki Ishizaki [mailto:ishiz...@jp.ibm.com]
Sent: Thursday, January 21, 2016 3:34 AM
To: dev@spark.apache.org; Ulanov, Alexander; Joseph Bradley
Cc: John Canny; Evan R. Sparks; Xiangrui Meng; Sam Halliday
Subject: RE: Using CUDA within Spark / boosting linear algebra

… (2 sheets):
https://docs.google.com/spreadsheets/d/1lWdVSuSragOobb0A_oeouQgHUMx378T9J5r7kwKSPkY/edit?usp=sharing
Benchmark code:
https://github.com/avulanov/scala-blas

Best regards, Alexander
… providing binary column storage for data partition. We would really appreciate it if you could give us comments, suggestions, or feedback.

Best Regards,
Kazuaki Ishizaki
From: Sam Halliday [mailto:sam.halli...@gmail.com]
Sent: Thursday, March 26, 2015 9:27 AM
To: John Canny
Cc: Xiangrui Meng; dev@spark.apache.org; Joseph Bradley; Evan R. Sparks; Ulanov, Alexander
Subject: Re: Using CUDA within Spark / boosting linear algebra

John, I have to disagree with you there. Dense matrices come up a lot in industry …
From: Ulanov, Alexander
Sent: Wednesday, April 01, 2015 12:11 PM
To: Xiangrui Meng; Sean Owen
Cc: Evan R. Sparks; Sam Halliday; dev@spark.apache.org; jfcanny
Subject: RE: Using CUDA within Spark / boosting linear algebra

FYI, I've added instructions to the Netlib-java wiki, and Sam added a link to them from the project's README.
Best regards, Alexander

-Original Message-
From: Xiangrui Meng [mailto:men...@gmail.com]
Sent: Monday, March 30, 2015 2:43 PM
To: Sean Owen
Cc: Evan R. Sparks; Sam Halliday; dev@spark.apache.org; Ulanov, Alexander; jfcanny
Subject: Re: Using CUDA within Spark / boosting linear algebra

Hi Alex,

Since it is non-trivial to make nvblas work with netlib-java, it would be great if you can send the instructions to netlib-java as part of the README. Hopefully we don't need to modify netlib-java code to use nvblas.

Best,
Xiangrui
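For reference, the nvblas setup being discussed centers on a small configuration file. A minimal sketch is below; the keys come from NVIDIA's nvblas documentation, and the CPU BLAS path is a placeholder you would point at your own OpenBLAS or MKL build, since nvblas only intercepts Level-3 routines and falls back to the CPU library for everything else.

```
# nvblas.conf (sketch; paths are placeholders)
NVBLAS_LOGFILE nvblas.log
# CPU BLAS that nvblas falls back to for non-Level-3 routines
NVBLAS_CPU_BLAS_LIB /usr/lib/libopenblas.so
# Use all visible GPUs for the intercepted GEMM-class calls
NVBLAS_GPU_LIST ALL
```

The JVM would then be started with libnvblas preloaded (e.g. via LD_PRELOAD) so that netlib-java's native BLAS lookups resolve to nvblas first; the exact steps are what the how-to promised in this thread covers.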
On Thu, Mar 26, 2015 at 9:54 AM, Sean Owen wrote:

The license issue is with libgfortran, rather than OpenBLAS.

(FWIW I am going through the motions to get OpenBLAS set up by default on CDH in the near future, and the hard part is just handling libgfortran.)

On Thu, Mar 26, 2015 at 4:07 PM, Evan R. Sparks wrote:
> Alright Sam - you are the expert …
-Original Message-
From: Ulanov, Alexander
Sent: Wednesday, March 25, 2015 2:31 PM
To: Sam Halliday
Cc: dev@spark.apache.org; Xiangrui Meng; Joseph Bradley; Evan R. Sparks; jfcanny
Subject: RE: Using CUDA within Spark / boosting linear algebra

Hi again,

I finally managed to use nvblas within Spark+netlib-java. It has exceptional … netlib+nvblas is on par with BIDMat-cuda. As promised, I am going to post a how-to for nvblas configuration.

https://docs.google.com/spreadsheets/d/1lWdVSuSragOobb0A_oeouQgHUMx378T9J5r7kwKSPkY/edit?usp=sharing

… performance of different libraries. I just want to pick a library that handles dense matrix multiplication best for my task.

P.S. My previous issue with nvblas was the following: it has Fortran blas functions, a…
Sent: Wednesday, March 25, 2015 3:09 PM
To: dev@spark.apache.org
Subject: Re: Using CUDA within Spark / boosting linear algebra

Alex,

I think you should recheck your numbers. Both BIDMat and nvblas are wrappers for cublas. The speeds are identical, except on machines that have multiple GPUs, which n…
Sure, I will write a how-to after I re-check the results.

-Original Message-
From: Sam Halliday [mailto:sam.halli...@gmail.com]
Sent: Wednesday, March 25, 2015 3:04 PM
To: Evan R. Sparks; dev@spark.apache.org
Subject: Re: Using CUDA within Spark / boosting linear algebra

If you write it …
From: Dmitriy Lyubimov [mailto:dlie...@gmail.com]
Sent: Wednesday, March 25, 2015 2:55 PM
To: Ulanov, Alexander
Cc: Sam Halliday; dev@spark.apache.org; Xiangrui Meng; Joseph Bradley; Evan R. Sparks; jfcanny
Subject: Re: Using CUDA within Spark / boosting linear algebra

Alexander,

does using netlib imply that one cannot switch between CPU and GPU blas alternatives at will at the same time? the choice is …
…64.so

So I am sure that netlib-native is loaded and cblas is supposedly used. However, matrix multiplication executes on the CPU, since I see 16% of CPU used and 0% of GPU used. I also checked different matrix sizes, from 100x100 to 12000x12000.

Could …
-Original Message-
From: Ulanov, Alexander
Sent: Tuesday, March 24, 2015 6:57 PM
To: Sam Halliday
Cc: dev@spark.apache.org; Xiangrui Meng; Joseph Bradley; Evan R. Sparks
Subject: RE: Using CUDA within Spark / boosting linear algebra

Hi,

I am trying to use nvblas with netlib-java … I could not use cblas from Atlas or Openblas because they link to their implementation and not to Fortran blas.

Best regards, Alexander
Subject: RE: Using CUDA within Spark / boosting linear algebra

Thanks so much for following up on this!

Hmm, I wonder if we should have a concerted effort to chart performance on various pieces of hardware...

On 9 Mar 2015 21:08, "Ulanov, Alexander" <alexander.ula...@hp.com> wrote:
Hi Reynold,

I left Chester with a copy of the slides, so I assume they'll be posted on the SF ML or Big Data sites. We have a draft paper under review. I can ask the co-authors about arxiv'ing it.

We have a few heuristics for power-law data. One of them is to keep the feature set sorted by frequency …
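The frequency-sorting heuristic mentioned above can be sketched in a few lines of Python (illustrative names only, not BIDMach code): assign the smallest indices to the most frequent features, so the power-law "head" of the data lands in a dense, contiguous block of each vector.

```python
from collections import Counter

def sort_features_by_freq(docs):
    """Order the vocabulary so the most frequent (power-law head) features
    get the smallest indices; dense activity then clusters at the front."""
    counts = Counter(tok for doc in docs for tok in doc)
    # Most frequent first; ties broken alphabetically for determinism.
    vocab = sorted(counts, key=lambda t: (-counts[t], t))
    return {tok: i for i, tok in enumerate(vocab)}

docs = [["the", "cat"], ["the", "dog"], ["the", "cat", "sat"]]
index = sort_features_by_freq(docs)
# "the" occurs 3 times -> index 0; "cat" twice -> index 1.
```

With this ordering, a fixed prefix of the index space captures most of the mass, which is what makes head/tail split strategies for power-law data workable.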
Reynold,

Prof Canny gave me the slides yesterday. I will post the link to the slides to both SF Big Analytics and SF Machine Learning meetups.

Chester

Sent from my iPad

On Mar 12, 2015, at 22:53, Reynold Xin wrote:
Thanks for chiming in, John. I missed your meetup last night - do you have
any writeups or slides about roofline design? In particular, I'm curious
about what optimizations are available for power-law dense * sparse? (I
don't have any background in optimizations)
On Thu, Mar 12, 2015 at 8:50 PM,
If you're contemplating GPU acceleration in Spark, it's important to look beyond BLAS. Dense BLAS probably account for only 10% of the cycles in the datasets we've tested in BIDMach, and we've tried to make them representative of industry machine learning workloads. Unless you're crunching images or …
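The dense × sparse products that dominate outside of image workloads can be illustrated with a toy CSR (compressed sparse row) representation. This is a pure-Python sketch of the idea, not BIDMach's kernels: only the stored nonzeros of the sparse operand are ever touched.

```python
def to_csr(rows):
    """Build CSR arrays (indptr, indices, values) from list-of-dict sparse rows."""
    indptr, indices, values = [0], [], []
    for row in rows:
        for j, v in sorted(row.items()):
            indices.append(j)
            values.append(v)
        indptr.append(len(indices))
    return indptr, indices, values

def dense_times_csr(dense, indptr, indices, values, n_cols):
    """Compute D (m x k) times sparse S (k x n), visiting only nonzeros of S."""
    m, k = len(dense), len(dense[0])
    out = [[0.0] * n_cols for _ in range(m)]
    for r in range(k):                         # row r of the sparse matrix
        for p in range(indptr[r], indptr[r + 1]):
            c, v = indices[p], values[p]
            for i in range(m):
                out[i][c] += dense[i][r] * v
    return out

# 2x2 dense times a 2x2 sparse matrix with a single nonzero S[0][1] = 2.0
D = [[1.0, 2.0], [3.0, 4.0]]
indptr, indices, values = to_csr([{1: 2.0}, {}])
result = dense_times_csr(D, indptr, indices, values, 2)
# Output column 1 is D's column 0 scaled by 2: [[0.0, 2.0], [0.0, 6.0]]
```

For power-law data the nonzero count per row is wildly skewed, which is exactly why these kernels, rather than dense GEMM, set the roofline for the workloads discussed here.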
… appreciate any help from you ☺

From: Sam Halliday [mailto:sam.halli...@gmail.com]
Sent: Monday, March 09, 2015 6:01 PM
To: Ulanov, Alexander
Cc: dev@spark.apache.org; Xiangrui Meng; Joseph Bradley; Evan R. Sparks
Subject: RE: Using CUDA within Spark / boosting linear algebra

Thanks so much for …
From: Sam Halliday [mailto:sam.halli...@gmail.com]
Sent: Tuesday, March 03, 2015 1:54 PM
To: Xiangrui Meng; Joseph Bradley
Cc: Evan R. Sparks; Ulanov, Alexander; dev@spark.apache.org
Subject: Re: Using CUDA within Spark / boosting linear algebra

BTW, is anybody on this list going to the London Meetup in a few weeks?
https://skillsmatter.com/meetups/6987-apache-spark-living-the-post-mapreduce-world#community

Would be nice to meet other people …
From: Xiangrui Meng [mailto:men...@gmail.com]
Sent: Monday, March 02, 2015 11:42 AM
To: Sam Halliday
Cc: Joseph Bradley; Ulanov, Alexander; dev; Evan R. Sparks
Subject: Re: Using CUDA within Spark / boosting linear algebra

On Fri, Feb 27, 2015 at 12:33 PM, Sam Halliday wrote:
> Also, check the JNILoader …
> DGEMM. Black line is the "cheating" time for the GPU and the green line is after copying the memory to/from the GPU memory. APUs have the potential to eliminate the green line.
> 1) A properly configured GPU matrix multiply implementation (e.g. BIDMat+GPU) can provide substantial (but less than an order of magnitude) benefit over a well-tuned …
> …) A poorly tuned CPU implementation can be 1-2 orders of magnitude worse than a well-tuned CPU implementation, particularly for larger matrices (netlib-f2jblas or netlib-ref). This is not to pick …
> Xiangrui, I was also surprised that BIDMat-cuda was faster than netlib-cuda, and the most reasonable explanation is that it holds the result in GPU memory, as Sam suggested. At the same time, it is OK because you can copy the result back from GPU only when needed. However, t…
> … ask the developer of BIDMat on his upcoming talk.
>
> Best regards, Alexander
Typo - CPU was 2.5x cheaper (not GPU!)

-Original Message-
From: Ulanov, Alexander
Sent: Thursday, February 26, 2015 2:01 PM
To: Sam Halliday; Xiangrui Meng
Cc: dev@spark.apache.org; Joseph Bradley; Evan R. Sparks
Subject: RE: Using CUDA within Spark / boosting linear algebra

Evan, thank …
From: Sam Halliday [mailto:sam.halli...@gmail.com]
Sent: Thursday, February 26, 2015 1:56 PM
To: Xiangrui Meng
Cc: dev@spark.apache.org; Joseph Bradley; Ulanov, Alexander; Evan R. Sparks
Subject: Re: Using CUDA within Spark / boosting linear algebra

Btw, I wish people would stop cheating when comparing CPU and GPU timings for things like matrix multiplication …
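The "cheating" complaint is that kernel-only timings hide the PCIe transfer. A back-of-envelope model makes the point; all numbers here are illustrative assumptions (roughly 1 TFLOP/s for the GPU GEMM, roughly 10 GB/s effective PCIe bandwidth), not measurements from this thread.

```python
def effective_gflops(n, gpu_gflops=1000.0, pcie_gb_s=10.0):
    """Effective throughput of an n x n double-precision GEMM once the
    host<->device copies of A, B and C are charged to the GPU's time."""
    flops = 2.0 * n ** 3                       # multiply-adds in GEMM
    bytes_moved = 3 * n * n * 8                # A, B in; C out (8-byte doubles)
    t_kernel = flops / (gpu_gflops * 1e9)
    t_copy = bytes_moved / (pcie_gb_s * 1e9)
    return flops / (t_kernel + t_copy) / 1e9   # GFLOP/s including copies

# Small matrices are transfer-bound; large ones approach the kernel rate.
small = effective_gflops(1000)    # copies roughly double the runtime
large = effective_gflops(16000)   # copies are a small fraction of the runtime
```

This matches the "green line" observation in the thread: the copy cost dominates until the matrices are large, which is also why a library that keeps results resident in GPU memory looks faster than one that copies back after every call.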
On Thu, Mar 12, 2015 at 4:18 PM, Ulanov, Alexander <alexander.ula...@hp.com> wrote:
> Just to summarize this thread, I was finally able to make all performance comparisons that we discussed. It turns out that:
>
> BIDMat-cublas>>BIDMat MKL==netlib-mkl==netlib-openblas-compiled>netlib-openblas-yum-repo==netlib-cublas>netlib-blas>f2jblas
>
> Below is the link to the spreadsheet with full results.
>
> https://docs.google.com/spreadsheets/d/1lWdVSuSragOobb0A_oeouQgHUMx378T9J5r7kwKSPkY/edit?usp=sharing
> | matrix sizes | … | … | … | netlib-f2jblas |
> | 100x100*100x100 | 0,00205596 | 0,000381 | 0,03810324 | 0,002556 |
> | 1000x1000*1000x1000 | 0,018320947 | 0,038316857 | 0,51803557 | 1,638475459 |
> | 1x1000…
> … full results.
>
> https://docs.google.com/spreadsheets/d/1lWdVSuSragOobb0A_oeouQgHUMx378T9J5r7kwKSPkY/edit?usp=sharing
>
> One thing still needs exploration: does BIDMat-cublas perform copying to/from machine's RAM?
>
> -----Original Message-----
> … my machine. Probably, I'll add two more columns with locally compiled openblas and cuda.
>
> Alexander
>
> From: Evan R. Sparks [mailto:evan.spa...@gmail.com]
> Sent: Monday, February 09, 2015 6:06 PM
> To: Ulanov, Alexander
> Cc: Joseph Bradley; dev@spark.apache.org
Sent: Tuesday, February 10, 2015 2:12 PM
To: Evan R. Sparks
Cc: Joseph Bradley; dev@spark.apache.org
Subject: RE: Using CUDA within Spark / boosting linear algebra

Thanks, Evan! It seems that ticket was marked as a duplicate though the original one discusses a slightly different topic. I was able to link netlib with M…
From: Evan R. Sparks [mailto:evan.spa...@gmail.com]
Sent: Monday, February 09, 2015 6:06 PM
To: Ulanov, Alexander
Cc: Joseph Bradley; dev@spark.apache.org
Subject: Re: Using CUDA within Spark / boosting linear algebra

Great - perhaps we can move this discussion off-list and onto a JIRA ticket? (Here's one: https://issues.apache.org/jira/browse/SPARK-5705)
> … netlib-java) interested to compare their libraries.
>
> Best regards, Alexander
> I would build
: Ulanov, Alexander
Cc: Joseph Bradley; dev@spark.apache.org
Subject: Re: Using CUDA within Spark / boosting linear algebra
I would build OpenBLAS yourself, since good BLAS performance comes from getting
cache sizes, etc. set up correctly for your particular hardware - this is often
a very tricky
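The "cache sizes, etc." tuning Evan refers to is essentially loop blocking: a tuned BLAS chooses tile sizes so each working set fits a cache level. A pure-Python toy of the restructuring (not performance code; the block size 2 is arbitrary, and a real BLAS picks it per hardware):

```python
def matmul_blocked(a, b, block=2):
    """Multiply square matrices by iterating over block x block tiles,
    the same loop restructuring a tuned BLAS does for cache reuse."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                for i in range(ii, min(ii + block, n)):
                    for k in range(kk, min(kk + block, n)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + block, n)):
                            c[i][j] += aik * b[k][j]
    return c

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = matmul_blocked(a, b)
# Same result as the naive triple loop: [[19.0, 22.0], [43.0, 50.0]]
```

The result is bit-identical to the naive loop; only the memory access order changes, which is why a generic prebuilt binary (tuned for someone else's cache hierarchy) can be much slower than a locally compiled one.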
> … blas is so much slower than BIDMat MKL?
>
> Best regards, Alexander
From: Joseph Bradley [mailto:jos...@databricks.com]
Sent: Thursday, February 05, 2015 5:29 PM
To: Ulanov, Alexander
Cc: Evan R. Sparks; dev@spark.apache.org
Subject: Re: Using CUDA within Spark / boosting linear algebra

Hi Alexander,

Using GPUs with Spark would be very exciting. Small comment: Concerning your question earlier about keeping data stored on the GPU rather …
> … suppose that netlib is using it.

From: Evan R. Sparks [mailto:evan.spa...@gmail.com]
Sent: Friday, February 06, 2015 5:19 PM
To: Ulanov, Alexander
Cc: Joseph Bradley; dev@spark.apache.org
Subject: Re: Using CUDA within Spark / boosting linear algebra

Getting breeze to pick up the right blas library is critical for performance. I recommend using OpenBLAS (or MKL, if you already have it). It might make sense to force BIDMat to use the same underlying BLAS library as well.

On Fri …
From: Evan R. Sparks [mailto:evan.spa...@gmail.com]
Sent: Thursday, February 05, 2015 1:29 PM
To: Ulanov, Alexander
Cc: dev@spark.apache.org
Subject: Re: Using CUDA within Spark / boosting linear algebra

I'd be surprised if BIDMat+OpenBLAS was significantly faster than netlib-java+OpenBLAS, but if it is much faster it's probably due to data layout and fewer levels of indirection …
From: Evan R. Sparks [mailto:evan.spa...@gmail.com]
Sent: Thursday, February 05, 2015 12:09 PM
To: Ulanov, Alexander
Cc: dev@spark.apache.org
Subject: Re: Using CUDA within Spark / boosting linear algebra

I'd expect that we can make GPU-accelerated BLAS faster than CPU blas in many cases.

You might consider taking a look at the codepaths that BIDMat (https://github.com/BIDData/BIDMat) takes and comparing them to netlib-java/breeze. John Canny et al. have done a bunch of work optimizing to make th…