> On 17 Feb 2015, at 11:06, Clément Bera <bera.clem...@gmail.com> wrote:
>
> Hello Andrea,
>
> The way you wrote your algorithm is nice, but it makes extensive use of
> closures and iterates a lot over collections.
I was about to say the same thing.

> Those are two aspects where the performance of Pharo has issues. Eliot
> Miranda and myself are working especially on those two cases to improve
> Pharo performance. If you don't mind, I will add your algorithm to the
> benchmarks we use, because it makes extensive use of the cases we are
> trying to optimize, and its results on the bleeding-edge VM are very
> encouraging.
>
> About your implementation, someone familiar with Pharo might replace
> #timesRepeat: with #to:do: in the two places you use it.
>
> For example:
>
>     run: points times: times
>         1 to: times do: [ :i | self run: points ].
>
> I don't believe it makes the code harder to read, and depending on the
> number of iterations it may show a real improvement, because #to:do: is
> optimized at compile time. I tried it and got a 15% reduction in the
> overall run time, though only on the bleeding-edge VM.

That is a lot of difference for such a small change.

> Another thing is that #groupedBy: is almost never used in the system
> and it's really *not* optimized at all. Maybe another collection
> protocol would be just as readable and faster, I don't know.
>
> Now about solutions:
>
> Firstly, the VM is getting faster. The Pharo 4 VM, to be released in
> July 2015, should be at least 2x faster than now. I tried it on your
> benchmark and got 5352.7 instead of 22629.1 on my machine, which is
> over a 4x performance boost and puts Pharo between Factor and Clojure
> in performance.

Super. Thank you, Esteban and of course Eliot, for such great work;
eventually we'll all be better off thanks to these improvements.

> An alpha release is available here:
> https://ci.inria.fr/pharo/view/4.0-VM-Spur/ . You need to use
> PharoVM-spur32 as a VM and Pharo-spur32 as an image (yes, the image
> changed too). You should be able to load your code, try your benchmark
> and get a similar result.

I did a quick test (the first time I tried Spur) and code loading was
spectacularly fast.
But the ride is still rough ;-)

> In addition, we're working on making the VM much faster again on
> benchmarks like yours in Pharo 5. We hope to have an alpha release this
> summer, but we can't say for sure that it will be ready. For this
> second step, I'm at a point where I can barely run a bench without a
> crash, so I can't tell you right now the exact performance to expect,
> but unless there's a miracle it should land somewhere between PyPy and
> Scala performance (it'll reach full performance once it gets more
> mature, not at first release anyway). I don't think we'll reach the
> performance of languages such as Nim or Rust any time soon. They're
> very different from Pharo: direct compilation to machine code, many
> low-level types, ... I'm not even sure a Java implementation could
> compete with them.
>
> Secondly, you can use bindings to native code instead. I showed here
> how to write the code in C and bind it with a simple callout, which may
> be what you need for your bench:
> https://clementbera.wordpress.com/2013/06/19/optimizing-pharo-to-c-speed-with-nativeboost-ffi/
> Now, this way of calling C does not work on the latest VM. There are
> three existing frameworks for calling C from Pharo, each with pros and
> cons; we're trying to unify them, but it's taking time. I believe that
> for the July release of Pharo 4 there will be an official recommended
> way of calling C, and that's the one you should use.
>
> I hope I wrote you a satisfying answer :-). I'm glad some people are
> deeply interested in Pharo performance.
>
> Best,
>
> Clement
>
>
> 2015-02-17 9:03 GMT+01:00 Andrea Ferretti <ferrettiand...@gmail.com>:
>
> Hi, a while ago I was evaluating Pharo as a platform for interactive
> data exploration, mining and visualization.
> I was fairly impressed by the tools offered by the Pharo distribution,
> but I had a general feeling that the platform was a little slow, so I
> decided to set up a small benchmark, given by an implementation of
> K-means.
>
> The original intention was to compare Pharo to Python (a language that
> is often used in this niche) and Scala (the language that we use in
> production), but since then I have implemented it in a few other
> languages as well. You can find the benchmark here:
>
> https://github.com/andreaferretti/kmeans
>
> Unfortunately, it turns out that Pharo is indeed the slowest among the
> implementations that I have tried. Since I am not an expert on Pharo or
> Smalltalk in general, I am asking for advice here, to find out whether
> maybe I am doing something stupid.
>
> To be clear: the aim is *not* to have an optimized version of K-means.
> There are various ways to improve the algorithm that I am using, but I
> am trying to get a feeling for the performance of an algorithm that a
> casual user could implement without much thought while exploring some
> data. So I am not looking for:
>
> - better algorithms
> - clever optimizations, such as, say, invoking native code
>
> I am asking here because there is the real possibility that I am just
> messing something up, and the same naive algorithm, written by someone
> more competent, would show real improvements.
>
> Please let me know if you find anything.
>
> Best,
> Andrea
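For readers following along without the repository at hand, here is a minimal sketch in Python (the benchmark's baseline language) of the kind of deliberately naive K-means the thread is about: heavy iteration over plain collections, closures in the inner loop, and a grouping step analogous to Pharo's #groupedBy:. All names here are illustrative, not taken from the linked repository.

```python
import random

def dist2(p, q):
    # Squared Euclidean distance in 2D; fine for nearest-centroid comparisons.
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def kmeans(points, k, iterations=15):
    # Naive seeding: just take the first k points as initial centroids.
    centroids = list(points[:k])
    for _ in range(iterations):
        # Assignment step: group each point under its nearest centroid.
        # This dict-accumulation is the #groupedBy:-like step, and the
        # lambda inside min() is the closure-heavy part.
        groups = {i: [] for i in range(k)}
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centroids[i]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group,
        # keeping the old centroid if its group happens to be empty.
        centroids = [mean(groups[i]) if groups[i] else centroids[i]
                     for i in range(k)]
    return centroids

if __name__ == "__main__":
    random.seed(42)
    pts = [(random.uniform(0, 10), random.uniform(0, 10))
           for _ in range(1000)]
    print(kmeans(pts, 10))
```

This is exactly the "casual user" style Andrea describes: no spatial index, no vectorization, no clever seeding, so most of the cost is per-point closure calls and collection traversal, the same operations that dominate the Pharo version.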