New submission from Rick Otten:

The documentation states that "|" parsing goes from left to right.  This 
doesn't seem to be true when spaces are involved.  (or \s).

Example:

In [40]: mystring
Out[40]: 'rwo incorporated'

In [41]: re.sub('incorporated| inc|llc|corporation|corp| co', '', mystring)
Out[41]: 'rwoorporated'

In this case " inc" was processed before incorporated.
If I take the space out:

In [42]: re.sub('incorporated|inc|llc|corporation|corp| co', '', mystring)
Out[42]: 'rwo '

incorporated is processed first.

If I put a space with each, then " incorporated" is processed first:

In [43]: re.sub(' incorporated| inc|llc|corporation|corp| co', '', mystring)
Out[43]: 'rwo'

And If use \s instead of a space, it is processed first:

In [44]: re.sub('incorporated|\sinc|llc|corporation|corp| co', '', mystring)
Out[44]: 'rwoorporated'

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23532>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to