Hi, my original problem is a dynammic programming problem in which I need to interpolate the value function on an irregular grid using a cubic spline method. I was translating my MATLAB code into Julia and used the Dierckx package in Julia to do the interpolation (there weren't some many alternatives that did spline on an irregular grid as far as I recall). In MATLAB I use interp1. It gave exactly the same result but it was about 50 times slower which puzzled me. So I made this (http://stackoverflow.com/questions/34766029/julia-vs-matlab-why-is-my-julia-code-so-slow) stackoverflow post.
The post is messy and you don't need to read through it I think. The bottom line was that the Dierckx package apparently calls some Fortran code which seems to pretty old (and slow, and doesn't use multiple cores. Nobody knows what exactly the interp1 is doing. My guess is that it's coded in C?!). So I asked a friend of mine who knows a little bit of C and he was so kind to help me out. He translated the interpolation method into C and made it such that it uses multiple threads (I am working with 12 threads). He also put it on github here (https://github.com/nuffe/mnspline). Equipped with that, I went back to my original problem and reran it. The Julia code was still 3 times slower which left me puzzled again. The interpolation itself was much faster now than MATLAB's interp1 but somewhere on the way that advantage was lost. Consider the following minimal working example preserving the irregular grid of the original problem which highlights the point I think (the only action is in the loop, the other stuff is just generating the irregular grid): MATLAB: spacing=1.5; Nxx = 300 ; Naa = 350; Nalal = 200; sigma = 10 ; NoIter = 10000; xx=NaN(Nxx,1); xmin = 0.01; xmax = 400; xx(1) = xmin; for i=2:Nxx xx(i) = xx(i-1) + (xmax-xx(i-1))/((Nxx-i+1)^spacing); end f_util = @(c) c.^(1-sigma)/(1-sigma); W=NaN(Nxx,1); W(:,1) = f_util(xx); W_temp = NaN(Nalal,Naa); xprime = NaN(Nalal,Naa); tic for banana=1:NoIter % tic xprime=ones(Nalal,Naa); W_temp=interp1(xx,W(:,1),xprime,'spline'); % toc end toc Julia: include("mnspline.jl") spacing=1.5 Nxx = 300 Naa = 350 Nalal = 200 sigma = 10 NoIter = 10000 xx=Array(Float64,Nxx) xmin = 0.01 xmax = 400 xx[1] = xmin for i=2:Nxx xx[i] = xx[i-1] + (xmax-xx[i-1])/((Nxx-i+1)^spacing) end f_util(c) = c.^(1-sigma)/(1-sigma) W=Array(Float64,Nxx,1) W[:,1] = f_util(xx) spl = mnspline(xx,W[:,1]) function performance(NoIter::Int64) W_temp = Array(Float64,Nalal*Naa) W_temp2 = Array(Float64,Nalal,Naa) xprime=Array(Float64,Nalal,Naa) for banana=1:NoIter xprime=ones(Nalal,Naa) W_temp = spl(xprime[:]) end W_temp2 = reshape(W_temp,Nalal,Naa) end @time performance() In the end I want to have a matrix, that's why I do all this reshaping in the Julia code. If I comment out the loop and just consider one iteration, Julia does it in (I ran it several times, precompiling etc) 0.000565 seconds (13 allocations: 1.603 MB) MATLAB on the other hand: Elapsed time is 0.007864 seconds. The bottom line being that in Julia the code is much faster (more than 10 times in this case), which should be since I use all my threads and the method is written in C. However, if I don't comment out the loop and run the code as posted above: Julia: 3.205262 seconds (118.99 k allocations: 15.651 GB, 14.08% gc time) MATLAB: Elapsed time is 4.514449 seconds. If I run the whole loop apparently MATLAB catches up a lot. It is still slower in this case. In the original problem though Julia was only about 3 times faster within the loop and once I looped through MATLAB turned out be 3 times faster. I am stuck here, does anybody have an idea what might be going? / If I am doing something wrong in my Julia code? A hint what I could do better? Right now I am trying to parallelize also the loop. But that's obviously unfair compared with MATLAB because you could also parallelize that (If you buy that expensive toolbox).
