On Sat, 08 May 2010 10:19:16 -0700, dasacc22 wrote:

> Hi
>
> This is a simple question. I'm looking for the fastest way to calculate
> the leading whitespace (as a string, ie ' ').

Is calculating the amount of leading whitespace really the bottleneck in
your application? If not, then trying to shave off microseconds from
something which is a trivial part of your app is almost certainly a waste
of your time.

[...]
> a = ' some content\n'
> b = a.strip()
> c = ' '*(len(a)-len(b))

I take it that you haven't actually tested this code for correctness,
because it's buggy. Let's test it:

>>> leading_whitespace = " "*2 + "\t"*2
>>> a = leading_whitespace + "some non-whitespace text\n"
>>> b = a.strip()
>>> c = " "*(len(a)-len(b))
>>> assert c == leading_whitespace
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError

Not only doesn't it get the whitespace right, but it doesn't even get the
*amount* of whitespace right:

>>> assert len(c) == len(leading_whitespace)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError

It doesn't even work correctly if you limit "whitespace" to mean spaces
and nothing else! It's simply wrong in every possible way.

This is why people say that premature optimization is the root of all
(programming) evil. Instead of wasting time and energy trying to optimise
code, you should make it correct first.

Your solutions 2 and 3 are also buggy. And solution 3 can be easily
re-written to be more straightforward. Instead of the complicated:

> def get_leading_whitespace(s):
>     def _get():
>         for x in s:
>             if x != ' ':
>                 break
>             yield x
>     return ''.join(_get())

try this version:

def get_leading_whitespace(s):
    accumulator = []
    for c in s:
        if c in ' \t\v\f\r\n':
            accumulator.append(c)
        else:
            break
    return ''.join(accumulator)

Once you're sure this is correct, then you can optimise it:

def get_leading_whitespace(s):
    t = s.lstrip()
    return s[:len(s)-len(t)]

>>> c = get_leading_whitespace(a)
>>> assert c == leading_whitespace
>>>

Unless your strings are very large, this is likely to be faster than any
other pure-Python solution you can come up with.

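If you want to convince yourself that the fast version really agrees with
the slow one, a rough sketch along these lines should do. The function
names, the WHITESPACE constant and the test strings are made up for
illustration. One assumption to note: here lstrip() is given the
whitespace characters explicitly, so that both versions use exactly the
same definition of "whitespace"; a bare lstrip() may strip additional
characters on some Python versions and string types.

```python
# Illustrative names only; WHITESPACE mirrors the characters used in the
# loop version above.
WHITESPACE = ' \t\v\f\r\n'


def leading_ws_loop(s):
    """Slow but obviously correct: collect characters until the first
    non-whitespace character."""
    accumulator = []
    for c in s:
        if c in WHITESPACE:
            accumulator.append(c)
        else:
            break
    return ''.join(accumulator)


def leading_ws_lstrip(s):
    """Fast version: strip the left side, then slice off whatever was
    removed. Passing WHITESPACE explicitly is an assumption made here so
    that both functions agree on what counts as whitespace."""
    t = s.lstrip(WHITESPACE)
    return s[:len(s) - len(t)]


def leading_ws_buggy(s):
    """The original attempt: strip() also removes trailing whitespace, so
    the count is wrong, and tabs are replaced by spaces."""
    return ' ' * (len(s) - len(s.strip()))


# Check the fast version against the reference on a few sample inputs.
samples = ['', 'x', '  x', '  \t\tsome text\n', '\n\n', '    indented  \n']
for s in samples:
    assert leading_ws_loop(s) == leading_ws_lstrip(s)

# The buggy version over-counts as soon as there is a trailing newline.
assert leading_ws_buggy('  x\n') != leading_ws_loop('  x\n')
```

Only once the assertions pass is it worth timing anything, for example
with the timeit module from the standard library.
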
--
Steven