Mike Meyer <[EMAIL PROTECTED]> wrote:
   ...
> > Of course, these results only apply where the "complexity" (e.g.,
> > the number of operators) in a single line of code is constant.
>
> I'm not sure what you're trying to say here. The tests ranged over
> things from PL/I to assembler. Are you saying that those two languages
> have the same "complexity in a single line"?
Not necessarily, since PL/I, for example, is quite capable of usages at
extremes of operator density per line. So, it doesn't even have "the
same complexity as itself", if used in widely different layout styles.
If the studies imply otherwise, then I'm reminded of the fact that both
Galileo and Newton published alleged experimental data which can be
shown to be "too good to be true" (fitting the theories too well,
according to chi-square tests and the like)...

> > for item in sequence: blaap(item)
> >
> > or
> >
> > for item in sequence:
> >     blaap(item)
> >
> > are EXACTLY as easy (or hard) to write, maintain, and document -- it's
> > totally irrelevant that the number of lines of code has "doubled" in
> > the second (more standard) layout of the code!-)
>
> The studies didn't deal with maintenance. They only dealt with
> documentation in so far as code was commented.
>
> On the other hand, studies of reading comprehension have shown that
> people can read and comprehend faster if the line lengths fall within
> certain ranges. While it's a stretch to assume those studies apply to
> code, I'd personally be hesitant to assume they don't apply without
> some research. If they do apply, then your claims about the difficulty
> of maintaining and documenting being independent of the textual line
> lengths are wrong. And since writing code inevitably involves
> debugging it - and the studies specified debugged lines - then the
> line length could affect how hard the code is to write as well.

If time to code depends on textual line lengths, then it cannot solely
depend on the number of lines at the same time. If, as you say, the
studies "prove" that the speed of delivering debugged code depends
strictly on the LOCs in the delivered code, then those studies would
also be showing that the textual length of the lines is irrelevant to
that speed (since, depending on coding style, in most languages one can
trade off textually longer lines for fewer lines).
OTOH, the following "mental experiment" shows that the purported
deterministic connection of coding time to LOC can't really hold: say
that two programmers, Able and Baker, are given exactly the same task
to accomplish in (say) language C, and end up with exactly the same
correct source code for the resulting function. Baker, being an honest
toiling artisan, codes and debugs his code in "expansive" style, with
lots of line breaks (as lots of programming shops practice), so, given
the final code looks like:

    while (foo())
    {
        bar();
        baz();
    }

(etc), he's coding 5 lines for each such loop. Able, being able, codes
and debugs extremely crammed code, so the same final code looks, while
Able is working on it, like:

    while (foo()) { bar(); baz(); }

so, Able is coding 1 line for each such loop, 5 times fewer than Baker
(thus, by hypothesis, Able must be done 5 times faster). When Able's
done coding and debugging, he runs a "code beautifier" utility, which
runs in negligible time (compared to the time it takes to code and
debug the program) and easily produces the same "expansively" laid-out
code as Baker worked with all the time.

So, Able is 5 times faster than Baker yet delivers identical final
code, based, please note, not on any substantial difference in skill,
but strictly on a trivial trick hinging on a popular and widely used
kind of code-reformatting utility. Real-life observation suggests that
working with extremely crammed code (to minimize the number of lines)
and beautifying it at the end is in fact not a sensible coding strategy
and cannot deliver such huge increases in coding (and debugging) speed.
Thus, either those studies or your reading of them must be fatally
flawed in this respect (most likely, some cautionary footnote about
coding styles and tools in use was omitted from the summaries, or
neglected in the reading). Such misunderstandings have seriously
damaged the practice of programming (and the management of programming)
in the past.
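The beautifier Able relies on need not be anything fancy, either. Here
is a toy sketch in Python (the function name and the token-splitting
heuristic are my own inventions, not any real tool's; real beautifiers
such as indent or clang-format actually parse the language, and would
not be fooled by braces inside strings or comments):

```python
import re

def beautify(crammed: str, indent: str = "    ") -> str:
    """Expand crammed C-style code: each brace and each statement
    on its own line, with one level of indentation per open brace.

    Toy sketch only -- it splits on raw tokens, so strings or
    comments containing '{', '}' or ';' would be mangled.
    """
    spaced = re.sub(r"\s*\{\s*", "\n{\n", crammed)  # '{' on its own line
    spaced = re.sub(r";\s*", ";\n", spaced)         # one statement per line
    spaced = re.sub(r"\s*\}\s*", "\n}\n", spaced)   # '}' on its own line
    out, depth = [], 0
    for raw in spaced.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "}":
            depth -= 1                  # close brace dedents itself
        out.append(indent * depth + line)
        if line == "{":
            depth += 1                  # open brace indents what follows
    return "\n".join(out)

print(beautify("while (foo()) { bar(); baz(); }"))
```

Run on Able's crammed 1-line loop, this emits exactly Baker's 5-line
"expansive" layout -- which is the whole point of the thought
experiment: the transformation is mechanical and essentially free.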
For example, shops evaluating coders' productivity in terms of lines of
code have convinced their coders to distort their style to emit more
lines of code in order to be measured as more productive -- it's
generally trivial to do so, of course, in many cases, e.g.

    for i in range(100): a[i] = i*i

can easily become 100 lines, "a[0] = 0" and so on (easily produced by
copy-and-paste, editor macros, or other similarly trivial means). At
the other extreme, some coders (particularly in languages suitable for
extreme density, such as Perl) delight in producing unreadable
"one-liner" ``very clever'' equivalents of straightforward loops that
would take up a few lines if written in the obvious way instead.

The textual measure of lines of code is extremely easy to obtain, and
pretty easy to adjust to account for some obvious first-order effects
(e.g., ignoring comments and whitespace, counting logical lines rather
than physical ones, etc), and that, no doubt, accounts for its undying
popularity -- but it IS really a terrible measurement of "actual
program size and complexity". Moreover, even if you normalized "program
size" by suitable language-specific factors (number of operators,
decision points, cyclomatic complexity, etc), the correlation between
program size and time to code it would still only hold within broadly
defined areas, not across the board. I believe "The Mythical Man-Month"
was the first widely read work to point out how much harder it is to
debug programs that use unrestrained concurrency (in today's terms,
think of multithreading without any of the modern theory and helpers
for it), which Brooks called "system programs", compared to "ordinary"
sequential code (which Brooks called "application programs" -- the
terminology is quite dated, but the deep distinction remains valid).
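Even a crude "adjusted" counter along the lines just mentioned (a
sketch of my own devising, not the metric of any actual study; real
counters like cloc do far more) makes the gaming problem concrete --
the loop and its 100-line expansion do identical work, yet score 1 and
100:

```python
def logical_loc(source: str) -> int:
    """Crudely count logical lines of Python-ish source: skip blank
    and comment-only lines, and count ';'-separated statements on one
    physical line individually.

    Toy sketch only -- it ignores string literals, continuations, etc.
    """
    total = 0
    for line in source.splitlines():
        code = line.split("#", 1)[0].strip()   # drop comments/whitespace
        if code:
            total += sum(1 for s in code.split(";") if s.strip())
    return total

compact = "for i in range(100): a[i] = i*i"
inflated = "\n".join(f"a[{i}] = {i} * {i}" for i in range(100))
print(logical_loc(compact), logical_loc(inflated))  # 1 100
```

A coder paid by this measure is thus rewarded a hundredfold for the
copy-and-paste version -- exactly the distortion described above.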
Also: one huge monolithic program using global variables for everything
is going to have complexity (and time to delivery of debugged code)
that grows way more than linearly with program size; to keep the
relation close to linear (though in no case can exact linearity be
repeatably achieved for sufficiently large programming systems, I
fear), we employ a huge variety of techniques to make our software
systems more modular.

It IS important to realize that higher-level languages, by making
programs of equivalent functionality (and of comparable intrinsic
difficulty, modularity, etc) "textually smaller" (and thereby
"conceptually" smaller), raise programmer productivity. But using
"lines of code", without all the appropriate qualifications, for these
measurements is not appropriate. Even the definition of a language's
level in terms of LOCs per function point is too "rough and ready" and
thus suffers from this issue (function points, as a
language-independent measure of a coding task's "size", have their own
issues, but much smaller ones than LOCs have as a measure of delivered
code's size).

Consider the analogy of measuring a writing task (in, say, English) by
the number of delivered words -- a very popular measure, too. No doubt,
all other things being equal, it may take a writer about twice as long
to deliver 2000 copy-edited words as to deliver 1000. But... all other
things are rarely equal. To me, for example, it often comes most
naturally to take about 500 words to explain and illustrate one
concept; but when I need to be concise, I will then take a lot of time
to edit and re-edit that text until just about all of the same issues
are put across in 200 words or less.
It may take me twice as long to rewrite the original 500 words into
200 as it took to put down the 500 words in the first place -- which
helps explain why many of my posts are so long, as I don't take all
that time to re-edit them, and why it takes so long to write a
"Nutshell" series book, where conciseness is crucial. Nor is it only my
own issue... remember Pascal's "Lettres Provinciales", and the famous
apology: "I am sorry that this letter is so long, but I did not have
the time to write a shorter one"!-)

Alex
--
http://mail.python.org/mailman/listinfo/python-list