Re: Matching horizontal white space

Ben Finney Sun, 14 Sep 2008 16:01:47 -0700

[EMAIL PROTECTED] writes:

> multipleSpaces = re.compile(u'\\h+')
> 
> importantTextString = '\n  \n  \n \t\t  '
> importantTextString = multipleSpaces.sub("M", importantTextString)


Please get into the habit of following the Python coding style guide
<URL:http://www.python.org/dev/peps/pep-0008>, which recommends
lowercase names with underscores for variables.

For literal strings that you expect to contain backslashes, it's often
clearer to use the "raw" string syntax:

    multiple_spaces = re.compile(ur'\h+')

> I would have expected consecutive spaces and tabs to be replaced by
> M

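Why, what leads you to expect that? Your regular expression doesn't
specify spaces or tabs. Python's 're' module has no '\h' escape (that
is Perl syntax for horizontal white space), so the pattern specifies
"the character 'h', one or more times".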
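
To check how a given interpreter treats the pattern, a small probe
like the following can help. This is only a sketch: the function name
is made up for illustration. On the Python 2 interpreters current at
the time of this thread the unknown escape matches a literal 'h';
Python 3.6 and later instead reject it when the pattern is compiled.

```python
import re


def describe_backslash_h():
    """Report how this interpreter's re module treats the pattern \\h+."""
    try:
        pattern = re.compile(r'\h+')
    except re.error:
        # Python 3.6+ refuses unknown escapes made of '\' and an ASCII letter.
        return 'rejected as a bad escape'
    if pattern.match('hhh') and not pattern.match(' \t'):
        # Older interpreters treat the unknown escape as the letter itself.
        return "matches the letter 'h'"
    return 'something else'


print(describe_backslash_h())
```

Either way, the pattern never matches spaces or tabs.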

For "space or tab", specify a character class of space and tab:

    >>> multiple_spaces = re.compile(u'[\t ]+')
    >>> important_text_string = u'\n  \n  \n \t\t  '
    >>> multiple_spaces.sub("M", important_text_string)
    u'\nM\nM\nM'
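
The same idea can be wrapped in a small helper. This is a minimal
sketch, not anything from the original post: the names HORIZONTAL_WS
and collapse_horizontal_space are hypothetical, and a plain raw string
is used so the code also runs on Python 3.

```python
import re

# Character class of space and tab, as in the session above.
# (Hypothetical name, chosen for this example.)
HORIZONTAL_WS = re.compile(r'[ \t]+')


def collapse_horizontal_space(text, replacement='M'):
    """Replace each run of spaces and tabs in text with replacement."""
    return HORIZONTAL_WS.sub(replacement, text)


result = collapse_horizontal_space('\n  \n  \n \t\t  ')
print(repr(result))
```

Note that '\s' would not be a substitute here, since it also matches
newlines; the explicit class leaves line breaks untouched.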


You probably want to read the documentation for the Python 're' module
<URL:http://www.python.org/doc/lib/module-re>. This is standard
practice when using any unfamiliar module from the standard library.

-- 
 \           “If you do not trust the source do not use this program.” |
  `\                                —Microsoft Vista security dialogue |
_o__)                                                                  |
Ben Finney