Ben Finney <[EMAIL PROTECTED]> writes:

> This seems to make the dangerous assumption that the programmer has
> the correct program in mind, and needs only to transfer it correctly
> to the computer.
Well, I hope the programmer can at least state some clear specifications
that a correct program should implement, e.g. it should not freeze or
crash.

> I would warrant that anyone who understands exactly how a program
> should work before writing it, and makes no design mistakes before
> coming up with a program that works, is not designing a program of any
> interest.

I'm not sure what the relevance of this is. Programmers generally have
an idea of what it is that they're trying to write. They then have to
make a bunch of design and implementation decisions as they go along.
Sometimes those decisions are subtly incorrect, and the more errors the
computer can identify automatically, the fewer errors are likely to
remain in the program.

> Certainly. Which is why the trend continues toward developing programs
> such that mistakes of all kinds cause early, obvious failure -- where
> they have a much better chance of being caught and fixed *before*
> innocent hands get ahold of them.

Here is what Edward Lee wrote about developing a multi-threaded Java app
(http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf):

    A part of the Ptolemy Project experiment was to see whether effective
    software engineering practices could be developed for an academic
    research setting. We developed a process that included a code
    maturity rating system (with four levels, red, yellow, green, and
    blue), design reviews, code reviews, nightly builds, regression
    tests, and automated code coverage metrics [43]. The portion of the
    kernel that ensured a consistent view of the program structure was
    written in early 2000, design reviewed to yellow, and code reviewed
    to green. The reviewers included concurrency experts, not just
    inexperienced graduate students (Christopher Hylands (now Brooks),
    Bart Kienhuis, John Reekie, and myself were all reviewers). We wrote
    regression tests that achieved 100 percent code coverage. The nightly
    build and regression tests ran on a two processor SMP machine, which
    exhibited different thread behavior than the development machines,
    which all had a single processor. The Ptolemy II system itself began
    to be widely used, and every use of the system exercised this code.
    No problems were observed until the code deadlocked on April 26,
    2004, four years later.

    It is certainly true that our relatively rigorous software
    engineering practice identified and fixed many concurrency bugs. But
    the fact that a problem as serious as a deadlock that locked up the
    system could go undetected for four years despite this practice is
    alarming. How many more such problems remain? How long do we need to
    test before we can be sure to have discovered all such problems?
    Regrettably, I have to conclude that testing may never reveal all the
    problems in nontrivial multithreaded code.

Now consider that the code you test today on a 2-core processor may be
running on a 100-core processor in 4 years, making any concurrency
errors that much worse. You can't test on the 100-core CPU today because
it doesn't exist yet, but your code still has to run on it correctly
once it's out there. Heck, if you're writing in CPython (no parallelism
at all, due to the GIL) you might be in for a surprise as soon as real
parallelism arrives on even 2 cores. It gets worse: maybe you're writing
operating system or language runtime code (e.g. a parallel garbage
collector for Java).
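To make the failure mode concrete, here is a minimal hypothetical sketch
(my own illustration, nothing to do with Ptolemy's actual code) of the
classic lock-ordering deadlock. Two threads take the same pair of locks
in opposite order; almost every run finishes cleanly, so a test suite
can pass for years before the one bad interleaving finally shows up:

import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def worker_1():
    with lock_a:        # acquires A, then B
        with lock_b:
            pass        # do some work

def worker_2():
    with lock_b:        # acquires B, then A -- the opposite order
        with lock_a:
            pass        # do some work

# The deadlock only happens if worker_1 holds A at the same instant that
# worker_2 holds B.  The window is tiny, so most runs (and most tests)
# never hit it -- which is exactly the problem.
t1 = threading.Thread(target=worker_1)
t2 = threading.Thread(target=worker_2)
t1.start(); t2.start()
t1.join(); t2.join()

No amount of coverage measurement tells you whether that interleaving
was ever exercised.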
That means there will be hostile code on the same machine, written by
diabolical maniacs who know every detail of your implementation, and who
deliberately do stuff (e.g. write code to hit specific cache lines or
cause page faults at fiendishly precisely chosen moments) specifically
to cause weird timing conditions and trigger bugs. Testing is not enough
to ensure that such attacks won't succeed.

I remember an old programming book (might even have been Knuth vol. 3)
recommending the Shell sorting algorithm (a very simple exchange sort
that runs in about O(n**1.4) time) instead of an O(n log n) algorithm,
because Shell's algorithm (being simpler) was less likely to end up with
implementation errors (a rough sketch of Shell sort appears at the end
of this post). Today I think an experienced programmer needing to code a
sort routine wouldn't hesitate to code a good algorithm, because our
languages and coding/testing methods are better than they were in the
Fortran and assembly language era when that book was written.

But there are still more complex problems (like shared memory
multiprocessing) that we're advised to stay away from because they're so
fraught with hazards. We instead use approaches that are simpler but
take performance penalties. (As a non-concurrency example, we might
implement geometric search through some kind of list scanning algorithm
instead of using more efficient but complicated k-d trees.) With higher
reliability languages we'll hopefully be in better shape to tackle those
complex problems instead of retreating from them, without expecting to
spend eternity hunting down mysterious bugs.
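As promised above, here is a rough sketch of Shell sort in Python -- my
own illustration of the diminishing-increment formulation, not the code
from whichever book that was, and using a simple halving gap sequence
(the roughly O(n**1.4) figure depends on the gap sequence chosen):

def shell_sort(items):
    """Sort a list in place using Shell's diminishing-increment method."""
    n = len(items)
    gap = n // 2
    while gap > 0:
        # Gapped insertion sort: compare each element with the one 'gap'
        # positions back and shift it toward the front as needed.
        for i in range(gap, n):
            value = items[i]
            j = i
            while j >= gap and items[j - gap] > value:
                items[j] = items[j - gap]
                j -= gap
            items[j] = value
        gap //= 2

data = [5, 2, 9, 1, 7, 3]
shell_sort(data)
print(data)   # [1, 2, 3, 5, 7, 9]

The whole thing fits on a screen and has no recursion or merging step to
get wrong, which is presumably why the old book favored it.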