Tim Hochberg <[EMAIL PROTECTED]> wrote:
   ...
> These two would be easy to accomplish using something like:
>
> def countchars(text):
>     n = 0
>     for line in text.split('\n'):
>         n += len(line.strip())
>     return n
>
> This would ignore leading and trailing white space as well as blank lines.

However, it would still make a=23 "better" than a = 23, and there's
really no reason for that. I would instead suggest using the tokenize
module.

> > - the length of names does not count, unless the code depends on it.
>
> Probably too hard.

Ignoring the length of identifiers is easy if you're tokenizing anyway.
Checking whether "the code depends on" the exact spelling of its
identifiers is a lot harder, admittedly -- you could try emitting a
variant of the code where all identifiers which are not builtins are
systematically replaced with 'x0001', 'x0002', etc, and verify that the
variant still passes the test, but that's definitely a nontrivial task.
I think we'll have to accept the fact that the "shortest program" will
use one-letter identifiers for everything except builtins.

> > Some harmonization procedure might be applied to every solution
> > before counting lines, in order to avoid spectacular cryptic stuff.
>
> I thought the metric was characters, not lines. At least that's what the
> 'about' page says. You still get hit by leading whitespace on multiple
> line programs though.

Definitely, characters. A high-granularity measure is essential to
reduce the chance of ties. Even so, there may well be equal-first-place
winners -- I hope ties are not broken by time of first submission,
since submitting at 14:00 UTC is WAY easier for residents of Europe
(residents of the Americas would have to go to bed VERY late, get up
VERY early, or spend extra effort setting up cron jobs), and that would
bias everything in a most unfair manner.


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list
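
A minimal sketch of the tokenize-based ideas discussed above, assuming
Python 3. The names token_score and rename_identifiers are hypothetical,
as are the choices to skip layout tokens entirely and to leave attribute
names (anything after a '.') untouched when renaming; the contest's real
metric may differ.

```python
import builtins
import io
import keyword
import tokenize

# Assumption: these token types are treated as layout only and never scored.
_LAYOUT = {
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT, tokenize.DEDENT,
    tokenize.COMMENT, tokenize.ENDMARKER,
}


def _tokens(source):
    """Yield the tokens of a source string."""
    return tokenize.generate_tokens(io.StringIO(source).readline)


def token_score(source, ignore_name_length=False):
    """Sum the lengths of all non-layout tokens in source.

    With ignore_name_length=True, every non-keyword name counts as one
    character, however long it is.
    """
    score = 0
    for tok in _tokens(source):
        if tok.type in _LAYOUT:
            continue
        is_identifier = (tok.type == tokenize.NAME
                         and not keyword.iskeyword(tok.string))
        if ignore_name_length and is_identifier:
            score += 1
        else:
            score += len(tok.string)
    return score


def rename_identifiers(source):
    """Return a variant of source with names replaced by x0001, x0002, ...

    Keywords, builtins, and names directly after a '.' are kept
    (the last rule is an assumption made for this sketch).
    """
    mapping = {}
    out = []
    prev = None
    for tok in _tokens(source):
        text = tok.string
        if (tok.type == tokenize.NAME
                and not keyword.iskeyword(text)
                and not hasattr(builtins, text)
                and prev != "."):
            # Reuse the replacement if this name was already seen.
            text = mapping.setdefault(text, "x%04d" % (len(mapping) + 1))
        out.append((tok.type, text))
        prev = tok.string
    # Two-element tuples make untokenize regenerate its own spacing.
    return tokenize.untokenize(out)
```

Under this scheme token_score("a=23") and token_score("a = 23") both
come to 4, so spacing around operators no longer matters. The renamed
variant from rename_identifiers would then be run against the same test
as the original; the sketch does not handle keyword arguments, imported
names, or soft keywords such as match, which is part of what makes the
check nontrivial.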