On May 8, 12:59 pm, Patrick Maupin <pmau...@gmail.com> wrote:
> On May 8, 12:19 pm, dasacc22 <dasac...@gmail.com> wrote:
>
> > Hi
> >
> > This is a simple question. I'm looking for the fastest way to
> > calculate the leading whitespace (as a string, ie ' ').
> >
> > Here are some different methods I have tried so far
> > --- solution 1
> >
> > a = ' some content\n'
> > b = a.strip()
> > c = ' '*(len(a)-len(b))
> >
> > --- solution 2
> >
> > a = ' some content\n'
> > b = a.strip()
> > c = a.partition(b[0])[0]
> >
> > --- solution 3
> >
> > def get_leading_whitespace(s):
> >     def _get():
> >         for x in s:
> >             if x != ' ':
> >                 break
> >             yield x
> >     return ''.join(_get())
> >
> > ---
> >
> > Solution 1 seems to be about as fast as solution 2 except in certain
> > circumstances where the value of b has already been determined for
> > other purposes. Solution 3 is slower due to the function overhead.
> >
> > Curious to see what other types of solutions people might have.
> >
> > Thanks,
> > Daniel
>
> Well, you could try a solution using re, but that's probably only
> likely to be faster if you can use it on multiple concatenated lines.
> I usually use something like your solution #1. One thing to be aware
> of, though, is that strip() with no parameters will strip *any*
> whitespace, not just spaces, so the implicit assumption in your code
> that what you have stripped is spaces may not be justified (depending
> on the source data). OTOH, depending on how you use that whitespace
> information, it may not really matter. But if it does matter, you can
> use strip(' ')
>
> If speed is really an issue for you, you could also investigate
> mxtexttools, but, like re, it might perform better if the source
> consists of several batched lines.
>
> Regards,
> Pat

Hi, thanks for the info. Using .strip() to remove all whitespace in
solution 1 is a must. If you only stripped ' ' spaces, then line endings
would get counted in the len() call and, when multiplied against ' ',
would produce an inaccurate result. Regex is significantly slower for my
purposes, but I've never heard of mxtexttools. Even if it proves slow,
it's spurred my curiosity as to what functionality it provides (on an
unrelated note).

--
http://mail.python.org/mailman/listinfo/python-list
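
For anyone who wants to check the arithmetic behind solution 1, here is a small sketch that compares the length difference produced by strip(), strip(' ') and lstrip() on a line ending in '\n'. The helper name leading_whitespace and the sample string are made up for illustration and are not from the thread. Whatever gets stripped from the right-hand end is counted in the difference too, so lstrip() may be the safer choice when only the leading run is wanted.

```python
def leading_whitespace(line):
    """Return the run of whitespace at the start of line (hypothetical helper)."""
    # lstrip() only touches the left side, so a trailing '\n' never
    # enters the length difference.
    return line[:len(line) - len(line.lstrip())]


# Sample line (an assumption): two leading spaces and a trailing newline.
a = '  some content\n'

# Length differences for the stripping variants discussed in the thread.
both_any = len(a) - len(a.strip())       # strips leading spaces and the '\n'
both_space = len(a) - len(a.strip(' '))  # strips only ' ' from both ends
left_any = len(a) - len(a.lstrip())      # strips only leading whitespace

print(both_any, both_space, left_any)    # prints: 3 2 2
print(repr(leading_whitespace(a)))       # prints: '  '
```

Slicing the original string also keeps tabs as tabs, which multiplying ' ' by a count would not.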