Either you have type-instability (you've checked, I hope), or you're doing the
computation in different ways. If you've ruled out the former, it must be the
latter---if you don't have type instability, a performance gap like that is
user error, pure and simple.
Look into whether a loop is being run more frequently in your Julia code than
it is in the equivalent vectorized computation.
--Tim
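[The type-stability check Tim suggests can be done with `@code_warntype`. A minimal sketch; the function `f` below is a made-up example of an unstable accumulator, not code from this thread:]

```julia
# Hypothetical example of type instability: `s` starts as an Int,
# then becomes a Float64 inside the loop.
function f(x)
    s = 0            # Int literal -- the source of the instability
    for v in x
        s += v       # s flips to Float64 on the first iteration
    end
    return s
end

f([1.0, 2.0, 3.0])        # returns 6.0

# @code_warntype flags the unstable variable (shown as a Union/Any type):
@code_warntype f(rand(10))
```

Fixing it is usually a one-character change: initialize with `s = zero(eltype(x))` so the accumulator's type matches the elements from the start.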
On Tuesday, January 27, 2015 08:08:08 AM Yuuki Soho wrote:
> Well, my Matlab code wasn't very good. I did a fully vectorized version,
> and it's about the same speed as Julia.
>
>
> I'm playing now with a problem that is easier to vectorize and there the
> matlab version is 300 times faster than Julia, which is a bit crazy.
>
> Maybe I messed something up again; that's really a big difference. I've
> checked that the output is identical, so at least I'm computing the same
> thing.
>
>
> Julia:
>
> A = rand(50,40,30);
> B = rand(50,50);
> C = rand(40,40);
> D = rand(30,30);
> E = rand(50,40,30);
> alpha = zeros(size(A))
>
> function testSum!(A::Array{Float64,3}, B::Array{Float64,2}, C::Array{Float64,2},
>                   D::Array{Float64,2}, E::Array{Float64,3}, alpha::Array{Float64,3})
>
>     tmp_1 = zero(Float64)
>     tmp_2 = zero(Float64)
>     tmp_3 = zero(Float64)
>
>     @inbounds begin
>         for x_3p = 1:30
>             for x_2p = 1:40
>                 for x_1p = 1:50
>                     tmp_3 = zero(tmp_3)
>                     for x_3 = 1:30
>                         tmp_2 = zero(tmp_2)
>                         for x_2 = 1:40
>                             tmp_1 = zero(tmp_1)
>                             @simd for x_1 = 1:50
>                                 tmp_1 += A[x_1,x_2,x_3] * B[x_1,x_1p]
>                             end
>                             tmp_2 += tmp_1 * C[x_2,x_2p]
>                         end
>                         tmp_3 += tmp_2 * D[x_3,x_3p]
>                     end
>                     alpha[x_1p,x_2p,x_3p] = E[x_1p,x_2p,x_3p] * tmp_3
>                 end
>             end
>         end
>     end
> end
>
> Matlab:
>
> function alpha = test2(A,B,C,D,E)
>
>     alpha = multiprod(multiprod(multiprod(B', A, [1 2], [1 2]), C, [1 2], [1 2]), D, [2 3], [1 2]) .* E;
>
> end
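[For reference, the same contraction can also be vectorized in Julia as three mode-n matrix products, which is presumably what `multiprod` is doing under the hood. A sketch, assuming the array sizes from the code above:]

```julia
# alpha[p1,p2,p3] = E[p1,p2,p3] * sum over x1,x2,x3 of
#                   A[x1,x2,x3]*B[x1,p1]*C[x2,p2]*D[x3,p3]
# Each contraction is a plain matrix multiply on a reshaped array.
A = rand(50,40,30); B = rand(50,50); C = rand(40,40); D = rand(30,30)
E = rand(50,40,30)

T = reshape(B' * reshape(A, 50, 40*30), 50, 40, 30)   # contract dim 1

T = permutedims(T, (2,1,3))                           # bring dim 2 to front
T = reshape(C' * reshape(T, 40, 50*30), 40, 50, 30)   # contract dim 2
T = permutedims(T, (2,1,3))                           # restore order

T = permutedims(T, (3,1,2))                           # bring dim 3 to front
T = reshape(D' * reshape(T, 30, 50*40), 30, 50, 40)   # contract dim 3
T = permutedims(T, (2,3,1))                           # restore order

alpha = E .* T
```

This does the whole computation in three BLAS calls plus a few copies, which is the apples-to-apples comparison for the vectorized Matlab version; the nested-loop version above recomputes the inner sums once per output element.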