I had hoped at least to stimulate interest in repeating the GP experiment with a current GHC version.

Until that happens, I'd be wary of drawing too many conclusions for today's applications from this paper. A two-orders-of-magnitude difference would seem to suggest programming problems to me (though the authors did have the problem/excuse of a lack of profiling tools).

(the experiment:
http://www.dcs.ed.ac.uk/home/stg/sfpw/book/Garmendia-Doval/cameraready.ps)

Not only is it old (ghc-4.08), it doesn't show much code, and what it does show looks suspiciously like a case of ML-mindset programming in Haskell (the kind that people get help with on the various Haskell lists nearly every day, nowadays).

According to section 3.5, the Haskell version came first, was used for experiments, and had errors in its design corrected; then came the ML version, which was at that point already substantially faster (from which I would be tempted to conclude that the coding style was a better fit for ML than for Haskell; cf. also 3.5.1, third paragraph). The ML version was then profiled and improved, after which these improvements were backported from the ML to the Haskell version! Okay, profiling was not available for the Haskell version back then, but using ML profiling to improve a Haskell version sounds highly dangerous to me, even more so when the authors do not even mention any awareness of this danger.

In 3.5.1, we see Alleles represented as a BoolVector, which sounds fine until we see it converted to its list of associations, which is then foldl-ed (!) over with a non-strict function (good that those chromosome lengths appear to be only 60 ...), for every evaluation. This is the main evaluation function for fitness, so it should be very much inner-loop code, with population sizes and generation counts ranging to 7500/700 and 50/150000.
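To make that concrete, here is a minimal sketch of the pattern as I read it from the paper. The paper shows only the representation and the fold, so the counting function and the concrete type (an Array Int Bool) are my assumptions:

    import Data.Array (Array, assocs)

    -- hypothetical stand-in for the paper's Alleles/BoolVector
    type BoolVector = Array Int Bool

    -- the pattern described above: convert the vector to its list of
    -- associations, then foldl over it with a non-strict function.
    -- foldl builds the whole chain of suspensions
    --   ((0 `f` x1) `f` x2) `f` ...
    -- before anything forces it, so every single fitness evaluation
    -- first allocates one thunk per allele, then collapses the chain.
    countSet :: BoolVector -> Int
    countSet v = foldl (\acc (_i, b) -> if b then acc + 1 else acc) 0 (assocs v)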
Of course, the same function in the ML version just means running a loop over a vector, with a strict accumulator.
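The Haskell rendering of that loop would be something like the sketch below (again only my guess at the code, not the paper's); foldl' from Data.List forces the accumulator at each step, and elems drops the index pairs that this fitness computation never looked at anyway:

    import Data.Array (Array, elems)
    import Data.List (foldl')

    type BoolVector = Array Int Bool

    -- the moral equivalent of the ML loop: a strict left fold over the
    -- vector's elements. the accumulator is evaluated at every step, so
    -- no chain of suspensions builds up, however long the chromosomes
    -- are or however often the function is called per generation.
    countSet' :: BoolVector -> Int
    countSet' v = foldl' (\acc b -> if b then acc + 1 else acc) 0 (elems v)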

That is just the code shown in the paper, and it is the only piece of code apart from structure declarations. Perhaps inlining and strictness analysis caught this particular problem, and perhaps calling out to C for random numbers didn't slow down the program either; the paper just doesn't give enough information to tell.

"So far, we have found the Standard ML version to out-perform the Haskell version by over two orders of magnitude. Despite extensive study of the Haskell version, no clear reason has emerged for this poor performance."

Note that they do not say "any Haskell version would have to be slow because of ..."; they say they don't know.

Well, enough of this FUD!-)
I hope it is fair to say that not just today's compilers are different, but also the user communities. I cannot imagine anyone experiencing this level of performance difference in a concrete application without asking questions about it on one of the Haskell lists, nor would I expect such questions to remain unanswered for long enough to merit a paper (at least not a paper about the question; Data.ByteString shows that there are still papers to be written about answers;-). And if it is too complicated for a public discussion, the project in question could always hire Haskell expertise, in the form of consulting (although I see that most entries have disappeared from the Haskell consultants list, probably because they tend to answer questions for free on the mailing lists?-).

claus

