My suggestion would be to replace
    for t in sort(collect(wc), by=x -> (-x.second, x.first)) 
        println(t.first, "\t", t.second)
    end

with
customlt(a,b) = (b.second < a.second) ? true : b.second == a.second ? a.first < b.first : false

function main()
    :
    :
    for t in sort(collect(wc), lt=customlt)
        println(t.first, "\t", t.second)
    end
end
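
For reference, here is a minimal self-contained sketch of the same idea. The sample `wc` dictionary is made up for illustration and is not the data from the original test. It only shows that the named comparison function gives the same ordering as the tuple-based `by` version (highest count first, ties broken alphabetically); whether it is actually faster would need to be measured on the real input.

```julia
# Hypothetical sample counts, standing in for the real `wc` dictionary.
wc = Dict("the" => 3, "cat" => 1, "sat" => 1, "on" => 3, "mat" => 2)

# a sorts before b if a has the higher count, or the same count and the smaller word.
customlt(a, b) = (b.second < a.second) ? true : b.second == a.second ? a.first < b.first : false

# Sort the (word => count) pairs with the named comparison function.
sorted = sort(collect(wc), lt=customlt)

for t in sorted
    println(t.first, "\t", t.second)
end
```

The point of the change is that `sort` gets a named top-level function instead of an anonymous one, which may help given the remark further down the thread that lambdas are currently slow.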



On Monday, November 30, 2015 at 5:08:56 PM UTC+2, Dan wrote:
>
> Can you provide the comparable python code? Perhaps even the data used for 
> testing?
>
> Since you are evaluating Julia, there are two important points to remember:
> 1) Because Julia is fast enough to implement basic functionality in 
> Julia itself, the distinction between Base Julia and additional packages 
> is small. Opting to use 'just' the core makes less sense - the core is 
> just a pre-compiled package.
> 2) The community is part of the language, so it should be taken into 
> account when evaluating it.
>
> On Monday, November 30, 2015 at 4:21:51 PM UTC+2, Attila Zséder wrote:
>>
>> Hi,
>>
>> Thank you all for the responses.
>>
>> 1. I tried simple profiling, but its output was difficult for me to 
>> interpret; maybe it will be clearer if I put more time into it. I will 
>> try ProfileView later.
>> 2. FastAnonymous gave me a serious speedup (20-30%). (But since it is an 
>> external dependency, it's kind of cheating, seeing the purpose of this 
>> small word count test)
>> 3. Using ASCIIString is not a good option right now, since there are 
>> Unicode characters there. I am trying with both UTF8String and 
>> AbstractString, and I don't see any difference in performance right now.
>> 4. Using ht_keyindex() is out of scope for me right now, because this is 
>> a pet project. I just wanted to see how fast the current implementation 
>> is, without these kinds of solutions.
>>
>> I think I will keep trying with later versions of Julia, but sticking 
>> to the standard library only, without using any external packages.
>>
>> Attila
>>
>> On Sunday, November 29, 2015 at 5:59:42 PM UTC+1, Yichao Yu wrote:
>>>
>>> On Sun, Nov 29, 2015 at 11:42 AM, Milan Bouchet-Valat <nali...@club.fr> 
>>> wrote: 
>>> > On Sunday, November 29, 2015 at 08:28 -0800, Cedric St-Jean wrote: 
>>> >> What I would try: 
>>> >> 
>>> >> 1. ProfileView to pinpoint the bottleneck further 
>>> >> 2. FastAnonymous to fix the lambda 
>>> >> 3. http://julia-demo.readthedocs.org/en/latest/manual/performance-tips.html 
>>> >> In particular, you may check `code_typed`. I don't have 
>>> >> experience with `split` and `eachline`. It's possible that they are 
>>> >> not type stable (the compiler can't predict their output's type). I 
>>> >> would try `for w::ASCIIString in ...` 
>>> >> 4. Dict{ASCIIString, Int}() 
>>> >> 5. Your loop will hash each string twice. I don't know how to fix 
>>> >> that, anyone? 
>>> > You can use the unexported Base.ht_keyindex() function like this: 
>>> > 
>>> > https://github.com/nalimilan/FreqTables.jl/blob/7884c000e6797d7ec621e07b8da58e7939e39867/src/freqtable.jl#L36 
>>> > 
>>> > But this is at your own risk, as it may change without warning in a 
>>> > future Julia release. 
>>> > 
>>> > We really need a public API for it. 
>>>
>>> IIUC, https://github.com/JuliaLang/julia/issues/12157 
>>>
>>> > 
>>> > 
>>> > Regards 
>>> > 
>>> >> 
>>> >> Good luck, 
>>> >> 
>>> >> Cédric 
>>> >> 
>>> >> On Saturday, November 28, 2015 at 8:08:49 PM UTC-5, Lampkld wrote: 
>>> >> > Maybe it's the lambda? These are slow in julia right now. 
>>>
>>
