On May 8, 12:19 pm, dasacc22 <dasac...@gmail.com> wrote:
> Hi
>
> This is a simple question. I'm looking for the fastest way to
> calculate the leading whitespace (as a string, ie ' ').
>
> Here are some different methods I have tried so far
> --- solution 1
>
> a = ' some content\n'
> b = a.strip()
> c = ' '*(len(a)-len(b))
>
> --- solution 2
>
> a = ' some content\n'
> b = a.strip()
> c = a.partition(b[0])[0]
>
> --- solution 3
>
> def get_leading_whitespace(s):
>     def _get():
>         for x in s:
>             if x != ' ':
>                 break
>             yield x
>     return ''.join(_get())
>
> ---
>
> Solution 1 seems to be about as fast as solution 2 except in certain
> circumstances where the value of b has already been determined for
> other purposes. Solution 3 is slower due to the function overhead.
>
> Curious to see what other types of solutions people might have.
>
> Thanks,
> Daniel

Well, you could try a solution using re, but that is only likely to be
faster if you can apply it to multiple concatenated lines at once.

I usually use something like your solution #1. One thing to be aware
of, though, is that strip() with no parameters strips *any*
whitespace, not just spaces, so the implicit assumption in your code
that what was stripped consists only of spaces may not be justified
(depending on the source data). OTOH, depending on how you use that
whitespace information, it may not really matter. But if it does
matter, you can use strip(' ').

If speed is really an issue for you, you could also investigate
mxTextTools, but, like re, it might perform better if the source
consists of several batched lines.

Regards,
Pat
--
http://mail.python.org/mailman/listinfo/python-list
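
As a rough sketch of the two suggestions above (not a benchmarked
recommendation): the first function is a variant of solution #1 that
strips only spaces and only from the left, and the second applies a
compiled re pattern to a whole batch of lines in one call. The names
leading_spaces, leading_spaces_batched and LEADING_SPACES are made up
for this example. Stripping on the left only also matters for the
count: a bare strip() removes the trailing '\n' as well, so
len(a)-len(b) in solution #1 comes out one too large for a line that
ends in a newline.

```python
import re

# Hypothetical helper pattern: the (possibly empty) run of spaces
# at the start of each line of a multi-line string.
LEADING_SPACES = re.compile(r"^ *", re.MULTILINE)


def leading_spaces(line):
    """Return the leading run of ' ' characters of a single line."""
    # lstrip(' ') removes only leading spaces, so tabs and the
    # trailing newline are left alone and do not inflate the count.
    return line[:len(line) - len(line.lstrip(' '))]


def leading_spaces_batched(text):
    """Return the leading spaces of each line in one re call."""
    return LEADING_SPACES.findall(text)
```

Whether the batched version is actually faster depends on the data, so
it is worth checking with timeit on real input before committing to
either one.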