On Jan 11, 1:15 pm, "Diez B. Roggisch" <de...@nospam.web.de> wrote:
> Jeremy wrote:
> > On Jan 11, 12:54 pm, Carl Banks <pavlovevide...@gmail.com> wrote:
> >> On Jan 11, 11:20 am, Jeremy <jlcon...@gmail.com> wrote:
> >>> I just profiled one of my Python scripts and discovered that >99% of
> >>> the time was spent in
> >>>
> >>> {built-in method sub}
> >>>
> >>> What is this function and is there a way to optimize it?
> >>
> >> I'm guessing this is re.sub (or, more likely, a method sub of an
> >> internal object that is called by re.sub).
> >>
> >> If all your script does is to make a bunch of regexp substitutions,
> >> then spending 99% of the time in this function might be reasonable.
> >> Optimize your regexps to improve performance. (We can help you if you
> >> care to share any.)
> >>
> >> If my guess is wrong, you'll have to be more specific about what your
> >> script does, and maybe share the profile printout or something.
> >>
> >> Carl Banks
> >
> > Your guess is correct. I had forgotten that I was using that
> > function.
> >
> > I am using the re.sub command to remove trailing whitespace from lines
> > in a text file. The commands I use are copied below. If you have any
> > suggestions on how they could be improved, I would love to know.
> >
> > Thanks,
> > Jeremy
> >
> > lines = self._outfile.readlines()
> > self._outfile.close()
> >
> > line = string.join(lines)
> >
> > if self.removeWS:
> >     # Remove trailing white space on each line
> >     trailingPattern = '(\S*)\ +?\n'
> >     line = re.sub(trailingPattern, '\\1\n', line)
>
> line = line.rstrip()?
>
> Diez

Yep. I was trying to reinvent the wheel. I now just remove the trailing whitespace from each line before joining the lines.

Thanks,
Jeremy
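
For anyone finding this thread later, here is a minimal sketch of what "strip before joining" could look like. It is not the code from the original script: the function name strip_trailing_whitespace and the sample data are made up for illustration. It also assumes that dropping all trailing whitespace (tabs included, not only spaces as in the regex) is acceptable, and that every line should end in a newline afterwards.

```python
def strip_trailing_whitespace(lines):
    """Join lines into one string after removing trailing whitespace from each.

    Hypothetical helper for illustration; `lines` is assumed to be a list of
    strings such as the result of file.readlines().
    """
    # rstrip() with no argument removes trailing spaces, tabs, and the newline
    # itself, so the newline is added back explicitly for every line.
    return "".join(line.rstrip() + "\n" for line in lines)


# Made-up sample input standing in for the output of readlines().
sample = ["first line   \n", "second line\t\n", "third line\n"]
cleaned = strip_trailing_whitespace(sample)
```

Because each line is handled with a plain string method, no regular expression is involved at all. One side effect to be aware of: a final line that had no newline will gain one.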