That's a great notebook.

On Sun, Mar 22, 2015 at 3:59 PM, Simon Danisch <sdani...@gmail.com> wrote:
> With @simd and @inbounds you can halve the time (at least on my machine
> with Julia 0.4).
> Here is a very nice article that explains what BLAS actually does and what
> Julia doesn't do:
>
> http://nbviewer.ipython.org/url/math.mit.edu/~stevenj/18.335/Matrix-multiplication-experiments.ipynb
>
> Best,
> Simon
>
> On Sunday, March 22, 2015 at 14:44:19 UTC+1, Uliano Guerrini wrote:
>
>> First look at Julia: I read somewhere that it is advised to de-vectorize
>> code, so I just tried this:
>>
>> function matmul(a,b)
>>     c = zeros(typeof(a[1,1]), (size(a,1), size(b,2)))
>>     for j = 1:size(b,2)
>>         for i = 1:size(a,1)
>>             for k = 1:size(b,1)
>>                 c[i,j] += a[i,k]*b[k,j]
>>             end
>>         end
>>     end
>>     c
>> end
>>
>> function matmul2(a,b)
>>     a*b
>> end
>>
>> a = rand(2,3);
>> b = rand(3,4);
>> c  = matmul(a,b);   # just to trigger the JIT and
>> c1 = matmul2(a,b);  # compile the functions ahead of @time
>> a = rand(6000,500);
>> b = rand(500,8000);
>> @time matmul(a,b);
>> @time matmul2(a,b);
>>
>> and I got this:
>>
>> elapsed time: 150.661463517 seconds (384000192 bytes allocated)
>> elapsed time: 0.990317124 seconds (384000192 bytes allocated)
>>
>> The built-in code for matrix multiplication is, I assume, some kind of
>> BLAS, maybe in Fortran (or assembler?), maybe optimized for SSE2, and for
>> sure using all 4 of my cores, so this is not the typical example where
>> de-vectorizing is advisable...
>>
>> Nonetheless, isn't a factor of 150 a bit higher than expected? Did I miss
>> something important in the matmul code?
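For what it's worth, here is a sketch of the kind of rewrite Simon is alluding to: hoisting `b[k,j]` out of the inner loop, moving the `i` loop innermost so `a` and `c` are traversed down their columns (Julia arrays are column-major), and annotating that loop with `@simd`/`@inbounds`. The function name `matmul_fast` is mine, not from the thread, and this is only a sketch; it won't match multithreaded BLAS, but it should close a large part of the gap:

```julia
# Sketch: cache-friendly loop order plus @simd/@inbounds.
# The innermost loop runs down column j of c and column k of a,
# which are contiguous in memory, so it can vectorize.
function matmul_fast(a, b)
    m, n, p = size(a, 1), size(b, 1), size(b, 2)
    c = zeros(eltype(a), m, p)
    for j = 1:p, k = 1:n
        bkj = b[k, j]              # hoist the invariant factor
        @simd for i = 1:m
            @inbounds c[i, j] += a[i, k] * bkj
        end
    end
    return c
end
```

The original `matmul` has `k` innermost, so every step of the hot loop strides across a row of `b`, touching a new column each time; that access pattern, much more than the lack of SIMD, is what makes it pathologically slow on large matrices.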