My suggestion would be to replace

    for t in sort(collect(wc), by=x -> (-x.second, x.first))
        println(t.first, "\t", t.second)
    end

with

    customlt(a,b) = (b.second < a.second) ? true : b.second == a.second ? a.first < b.first : false

    function main()
        :
        :
        for t in sort(collect(wc), lt=customlt)
            println(t.first, "\t", t.second)
        end
    end

On Monday, November 30, 2015 at 5:08:56 PM UTC+2, Dan wrote:
>
> Can you provide the comparable Python code? Perhaps even the data used for
> testing?
>
> Since you are evaluating Julia, there are two important points to remember:
> 1) Because Julia is fast enough that basic functionality is implemented in
> Julia itself, the distinction between Base Julia and additional packages
> is small. Opting to use 'just' the core makes less sense - the core is
> just a pre-compiled package.
> 2) The community is part of the language, so it should be taken into
> account when evaluating it.
>
> On Monday, November 30, 2015 at 4:21:51 PM UTC+2, Attila Zséder wrote:
>>
>> Hi,
>>
>> Thank you all for the responses.
>>
>> 1. I tried simple profiling, but its output was difficult for me to
>> interpret; maybe it will be clearer if I put more time into it. I will
>> try ProfileView later.
>> 2. FastAnonymous gave me a serious speedup (20-30%). (But since it is an
>> external dependency, it's kind of cheating, given the purpose of this
>> small word count test.)
>> 3. Using ASCIIString is not a good option right now, since there are
>> Unicode characters in the data. I am trying both UTF8String and
>> AbstractString; I don't see any difference in performance right now.
>> 4. Using ht_keyindex() is out of scope for me right now, because this is
>> a pet project. I just wanted to see how fast the current implementation
>> is without these kinds of solutions.
>>
>> I think I will keep trying with later versions of Julia, but sticking to
>> the standard library only, without using any external packages.
>>
>> Attila
>>
>> On Sunday, November 29, 2015 at 17:59:42 UTC+1, Yichao Yu wrote:
>>>
>>> On Sun, Nov 29, 2015 at 11:42 AM, Milan Bouchet-Valat <nali...@club.fr>
>>> wrote:
>>> > On Sunday, November 29, 2015 at 08:28 -0800, Cedric St-Jean wrote:
>>> >> What I would try:
>>> >>
>>> >> 1. ProfileView to pinpoint the bottleneck further
>>> >> 2. FastAnonymous to fix the lambda
>>> >> 3. http://julia-demo.readthedocs.org/en/latest/manual/performance-tips.html
>>> >> In particular, you may check `code_typed`. I don't have
>>> >> experience with `split` and `eachline`. It's possible that they are
>>> >> not type stable (the compiler can't predict their output's type). I
>>> >> would try `for w::ASCIIString in ...`
>>> >> 4. Dict{ASCIIString, Int}()
>>> >> 5. Your loop will hash each string twice. I don't know how to fix
>>> >> that, anyone?
>>> > You can use the unexported Base.ht_keyindex() function like this:
>>> >
>>> > https://github.com/nalimilan/FreqTables.jl/blob/7884c000e6797d7ec621e07b8da58e7939e39867/src/freqtable.jl#L36
>>> >
>>> > But this is at your own risk, as it may change without warning in a
>>> > future Julia release.
>>> >
>>> > We really need a public API for it.
>>>
>>> IIUC, https://github.com/JuliaLang/julia/issues/12157
>>>
>>> >
>>> >
>>> > Regards
>>> >
>>> >>
>>> >> Good luck,
>>> >>
>>> >> Cédric
>>> >>
>>> >> On Saturday, November 28, 2015 at 8:08:49 PM UTC-5, Lampkld wrote:
>>> >> > Maybe it's the lambda? These are slow in Julia right now.
>>> >>
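
For reference, the named comparator suggested at the top of this message can be tried on its own. This is only a sketch under a few assumptions: the `wc` dictionary is made-up sample data rather than the counts from the original test, and the code is written for a recent Julia (1.x), where string keys are plain `String`. It checks that `lt=customlt` gives the same ordering as the anonymous `by` function: descending count, ties broken by ascending word.

```julia
# Comparator from the suggestion: a sorts before b when a has the higher
# count, or the same count and an alphabetically smaller word.
customlt(a, b) = (b.second < a.second) ? true :
                 b.second == a.second ? a.first < b.first : false

# Hypothetical word counts standing in for the real `wc` dictionary.
wc = Dict("the" => 3, "cat" => 1, "sat" => 1, "a" => 3)

# Sort with the named comparator, and with the original anonymous `by`.
sorted_lt = sort(collect(wc), lt=customlt)
sorted_by = sort(collect(wc), by = x -> (-x.second, x.first))

for t in sorted_lt
    println(t.first, "\t", t.second)
end
```

Whether this is actually faster than the `by` version depends on the Julia version in use, so it is worth timing both on the real data.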