Thanks, Dan! Indeed my "x" vector is sorted and your suggestion is really
fast!
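
For anyone finding this thread later: since "x" is sorted, the two linear
scans could probably be replaced by binary searches. Below is a minimal
sketch of that idea, not something posted in the thread. It assumes "x" is
sorted in ascending order and has the same length as "y"; the name
`calcSumSorted` is hypothetical, while `searchsortedlast` is the standard
Base function.

```julia
# Sketch only: assumes `x` is sorted in ascending order and that
# `x` and `y` have the same length. `calcSumSorted` is a hypothetical name.
function calcSumSorted(x, y, Ei, Ef)
    # Index of the first element with x[k] > Ei.
    lo = searchsortedlast(x, Ei) + 1
    # Index of the last element with x[k] <= Ef.
    hi = searchsortedlast(x, Ef)
    mysum = zero(eltype(y))
    # lo:hi is empty when no element falls in (Ei, Ef], so the loop is skipped.
    @inbounds @simd for k in lo:hi
        mysum += y[k]
    end
    return mysum
end
```

Called as calcSumSorted(x, y, Ei, Ef), this should agree with calcSum up to
floating-point rounding, but I have not timed it against the other versions.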

Best,

Charles

On 30 December 2015 at 12:08, Dan <getz...@gmail.com> wrote:

> Thanks Kristoffer, turns out there is always interesting stuff in the bag
> of optimization tricks.
> Regarding the original function, a cheat could make it faster: The `x`
> vector is sorted, which means:
>
> function calcSum4(x::Array{Float64,1}, y::Array{Float64,1}, Ei::Float64,
> Ef::Float64, N::Int64)
>            mysum=0.0::Float64;
>            i=1
>            @inbounds while i<=N && x[i]<=Ei i+=1 ; end
>            j=i
>            @inbounds while j<=N && x[j]<=Ef j+=1 ; end
>            @inbounds @simd for k=i:(j-1) mysum += y[k] ; end
>            return(mysum);
>        end
>
> This returns the same answer.
> There are always more options ;)
>
> On Wednesday, December 30, 2015 at 12:19:41 PM UTC+2, Charles Santana
> wrote:
>>
>> The magic of @inbounds and @simd :)
>>
>> Thanks, Kristoffer!
>>
>> Charles
>>
>>
>> On Wednesday, December 30, 2015, Kristoffer Carlsson <kcarl...@gmail.com>
>> wrote:
>>
>>> If you want to get an even faster version you could do something like:
>>>
>>> function calcSum_simd{T}(x::Vector{T}, y::Vector{T}, Ei::T, Ef::T)
>>>     mysum = zero(T)
>>>     @inbounds @simd for i in eachindex(x, y)
>>>         mysum += ifelse(Ei < x[i] <= Ef, y[i], zero(T))
>>>     end
>>>     return mysum
>>> end
>>>
>>> which would use SIMD instructions.
>>>
>>> Timing difference:
>>>
>>> N = 10000000
>>> y = rand(N);
>>> x = rand(N)
>>> Ei = 0.2;
>>> Ef = 0.7;
>>>
>>> julia> @time calcSum_simd(x,y,Ei, Ef);
>>>   0.021155 seconds (5 allocations: 176 bytes)
>>>
>>>
>>> julia> @time calcSum(x,y,Ei, Ef)
>>>   0.069911 seconds (5 allocations: 176 bytes)
>>>
>>>
>>> Regarding map being slow: that is being worked on here:
>>> https://github.com/JuliaLang/julia/pull/13412
>>>
>>>
>>> On Wednesday, December 30, 2015 at 3:05:47 AM UTC+1, Charles Santana
>>> wrote:
>>>>
>>>> Sorry, there was a typo in the function calcSum2. Please consider the
>>>> following code:
>>>>
>>>> function calcSum2(x::Array{Float64,1}, y::Array{Float64,1},
>>>> Ei::Float64, Ef::Float64, N::Int64)
>>>>
>>>>         return sum(y[map(v -> Ei < v <= Ef, x)]);
>>>> end
>>>>
>>>>
>>>> And so the results of the calls for this function change a bit (but not
>>>> the performance):
>>>>
>>>>         @time calcSum2(x,y,Ei,Ef,N)
>>>>           0.000110 seconds (1.01 k allocations: 20.969 KB)
>>>>         246.1975746121703
>>>>
>>>>         @time calcSum2(x,y,Ei,Ef,N)
>>>>           0.000079 seconds (1.01 k allocations: 20.969 KB)
>>>>         246.1975746121703
>>>>
>>>>         @time calcSum2(x,y,Ei,Ef,N)
>>>>           0.000051 seconds (1.01 k allocations: 20.969 KB)
>>>>         246.1975746121703
>>>>
>>>>
>>>> Thanks again, sorry for this inconvenience!
>>>>
>>>> Charles
>>>>
>>>> On 30 December 2015 at 03:00, Charles Novaes de Santana <
>>>> charles...@gmail.com> wrote:
>>>>
>>>>> Dear all,
>>>>>
>>>>> In a project I am developing, @profile shows me that the slowest part
>>>>> of the code is summing the elements of an Array that satisfy certain
>>>>> conditions.
>>>>>
>>>>> Please consider the following code:
>>>>>
>>>>>         y = rand(1000);
>>>>>         x = collect(0.0:0.001:0.999);
>>>>>         Ei = 0.2;
>>>>>         Ef = 0.7;
>>>>>         N = length(x)
>>>>>
>>>>> I want to calculate the sum of the elements of "y" whose corresponding
>>>>> values in "x" lie between "Ei" and "Ef". If I were using R, for
>>>>> example, I would use something like:
>>>>>
>>>>> mysum = sum(y[which((x < Ef) & (x > Ei))]); #(not tested in R, but I
>>>>> suppose that is the way to do it)
>>>>>
>>>>> In Julia, I can think of at least two ways to calculate it:
>>>>>
>>>>> function calcSum(x::Array{Float64,1}, y::Array{Float64,1},
>>>>> Ei::Float64, Ef::Float64, N::Int64)
>>>>>         mysum=0.0::Float64;
>>>>>         for(i in 1:N)
>>>>>              if( Ei < x[i] <= Ef)
>>>>>                  mysum += y[i];
>>>>>              end
>>>>>         end
>>>>>         return(mysum);
>>>>> end
>>>>>
>>>>> function calcSum2(x::Array{Float64,1}, y::Array{Float64,1},
>>>>> Ei::Float64, Ef::Float64, N::Int64)
>>>>>         return sum(y[map(v -> Ei < v < Ef, x)]);
>>>>> end
>>>>>
>>>>> As you can see below, the first function (calcSum) performs much
>>>>> better than the second one (at least 10x faster).
>>>>>
>>>>>
>>>>>          @time calcSum(x,y,Ei,Ef,N)
>>>>>           0.003986 seconds (2.56 k allocations: 125.168 KB)
>>>>>         246.19757461217014
>>>>>
>>>>>         @time calcSum(x,y,Ei,Ef,N)
>>>>>           0.000003 seconds (5 allocations: 176 bytes)
>>>>>         246.19757461217014
>>>>>
>>>>>         @time calcSum(x,y,Ei,Ef,N)
>>>>>           0.000002 seconds (5 allocations: 176 bytes)
>>>>>         246.19757461217014
>>>>>
>>>>>         @time calcSum2(x,y,Ei,Ef,N)
>>>>>           0.003762 seconds (1.61 k allocations: 53.743 KB)
>>>>>         245.48156534879303
>>>>>
>>>>>         @time calcSum2(x,y,Ei,Ef,N)
>>>>>           0.000050 seconds (1.01 k allocations: 20.969 KB)
>>>>>         245.48156534879303
>>>>>
>>>>>         @time calcSum2(x,y,Ei,Ef,N)
>>>>>           0.000183 seconds (1.01 k allocations: 20.969 KB)
>>>>>         245.48156534879303
>>>>>
>>>>> Does anyone have an idea about how to improve the performance here?
>>>>>
>>>>> Many thanks for any help! Happy new year to all of you!
>>>>>
>>>>> Charles
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Um axé! :)
>>>>>
>>>>> --
>>>>> Charles Novaes de Santana, PhD
>>>>> http://www.imedea.uib-csic.es/~charles
>>>>>


