Peter Maas wrote:
> I think that a LOC comparison between a language that enforces line breaks
> and another language that enables putting lots of code in one line
> doesn't make much sense. I wonder why comparisons aren't made in terms of
> word count. Word count would include literals, constants, variables,
> keywords, operators, bracket- and block delimiter pairs. Python
> indent/unindent would of course also count as block delimiters. I think
> this would be a more precise measure for software size.

I don't disagree, but "word" counts aren't so simple, either to define or
to implement. What counts as a word?

Parser tokens? That counts a.b (or a::b, or a->b, depending on language)
as 3 words.

Block delimiters? After a month, you don't even notice them in properly
formatted code, which is why Python doesn't have them in the first place.

Operators? Then e.g. a = b + c + d + e counts more than
a = add (b, c, d, e). The complexity of an expression seems to be
determined by the number of operands; counting operators as well arguably
overcounts.

Regardless of the above choices, you still need a parser (or at least a
lexer) to count anything. Whitespace separation won't cut it - what
happens with 'for (i=0;i<5;i++)' or 's = "foo bar baz"'? The first comes
out as 2 words, and the second splits one string literal into 3.

If you toss out operators, you could almost get away with regular
expressions for counting the identifiers, keywords, and literals. But
there's still the problem of overcounting string literals.

Line counts are simple to compute, and it's easier to agree on which lines
to count. Thus their popularity.

-- 
Edward Elliott
UC Berkeley School of Law (Boalt Hall)
complangpython at eddeye dot net
-- 
http://mail.python.org/mailman/listinfo/python-list
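
To make the lexer point above concrete, here is a minimal sketch in Python. It is an illustration, not a proposed metric. It uses the standard library's tokenize module, so it only handles Python source. The names count_whitespace_words and count_tokens, the two flags, and the choice of which token types to skip are all assumptions made up for this example. Having to make those choices at all is the difficulty described above.

```python
import io
import tokenize

# Token types ignored in this sketch (an assumption, not a standard):
# comments, line ends, and the end-of-input marker.
SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE, tokenize.ENDMARKER}


def count_whitespace_words(source):
    """Naive count: split on whitespace, no lexer involved."""
    return len(source.split())


def count_tokens(source, count_operators=True, count_blocks=True):
    """Count lexer tokens in Python source.

    count_operators: include OP tokens (operators and punctuation).
    count_blocks: include INDENT/DEDENT as block delimiters.
    """
    total = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIP:
            continue
        if tok.type in (tokenize.INDENT, tokenize.DEDENT) and not count_blocks:
            continue
        if tok.type == tokenize.OP and not count_operators:
            continue
        total += 1
    return total


if __name__ == "__main__":
    for src in ['a.b\n', 's = "foo bar baz"\n', 'if x:\n    y = 1\n']:
        print(repr(src),
              count_whitespace_words(src),
              count_tokens(src),
              count_tokens(src, count_operators=False, count_blocks=False))
```

Under these assumptions, a.b is 1 whitespace word but 3 tokens, and the string assignment is 5 whitespace words but 3 tokens. Flipping either flag changes the totals again, so two people can count the same file and get different "sizes" unless they agree on every one of these choices first.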