On Sun, 2007-07-22 at 22:33 +0200, Peter Kleiweg wrote:
> >>> import re
> >>> s = u'a b\u00A0c d'
> >>> s.split()
> [u'a', u'b', u'c', u'd']
> >>> re.findall(r'\S+', s)
> [u'a', u'b\xa0c', u'd']
>
If you want the Unicode interpretation of \S+, etc., you pass the
re.UNICODE flag.
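
For illustration, a minimal sketch of what that flag changes. The variable names (ascii_flag, tokens_unicode, tokens_ascii) are made up for this example and are not from the thread. The getattr() call is an assumption added so the snippet also runs on Python 3, where str patterns are already Unicode-aware by default and the ASCII-only behaviour has to be requested with re.ASCII.

```python
import re

s = u'a b\u00A0c d'  # U+00A0 is NO-BREAK SPACE

# With re.UNICODE, \s and \S follow the Unicode character properties,
# so the no-break space counts as whitespace, as it does for s.split().
tokens_unicode = re.findall(r'\S+', s, re.UNICODE)

# ASCII-only matching: the default (flags=0) on Python 2,
# re.ASCII on Python 3. Here U+00A0 is NOT treated as whitespace.
ascii_flag = getattr(re, 'ASCII', 0)
tokens_ascii = re.findall(r'\S+', s, ascii_flag)

print(tokens_unicode)
print(tokens_ascii)
```

With the flag, the regex result should agree with s.split(); without it, 'b\xa0c' stays together as one token, which is the output shown in the quoted session.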
On Sun, 2007-07-22 at 22:33 +0200, Peter Kleiweg wrote:
> >>> import re
> >>> s = u'a b\u00A0c d'
> >>> s.split()
> [u'a', u'b', u'c', u'd']
> >>> re.findall(r'\S+', s)
> [u'a', u'b\xa0c', u'd']
And your question is...?
> This isn't documented either:
>
> >>> s = '
>>> import re
>>> s = u'a b\u00A0c d'
>>> s.split()
[u'a', u'b', u'c', u'd']
>>> re.findall(r'\S+', s)
[u'a', u'b\xa0c', u'd']
This isn't documented either:
>>> s = ' b c '
>>> s.split()
['b', 'c']
>>> s.split(' ')
['', 'b', 'c', '']
--
Peter
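
As a hedged illustration of the second observation, the sketch below contrasts the two modes of split(): with no argument (or None), runs of whitespace act as a single separator and leading/trailing whitespace is dropped; with an explicit ' ', every single space is a delimiter and empty strings are kept. The variable names are invented for this example.

```python
s = ' b c '

# No separator given: split on runs of whitespace, no empty strings.
default_split = s.split()

# Passing None explicitly behaves the same as passing nothing.
none_split = s.split(None)

# Explicit separator: each ' ' delimits a field, so the leading and
# trailing spaces produce empty strings at both ends.
explicit_split = s.split(' ')

# Two consecutive spaces yield an empty field between them.
double_space = 'a  b'.split(' ')

print(default_split)
print(explicit_split)
print(double_space)
```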