phasma wrote:
string = u"Привет"
(u'\u041f\u0440\u0438\u0432\u0435\u0442',)
string = u"Hi.Привет"
(u'Hi',)
The [\w\s] pattern you used matches letters, digits, underscores, and
whitespace. "." doesn't fall into any of those categories, so the "match"
method stops when it gets to that character.
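
To make the point concrete, here is a minimal sketch, assuming Python 3 (where str patterns are Unicode-aware by default, so neither (?u) nor re.UNICODE is needed). The names pattern, text, first_run and all_runs are made up for illustration; findall() is one possible way to pick up the runs after the ".", not something the original code used.

```python
import re

# \w covers letters, digits, and underscore; \s covers whitespace.
pattern = re.compile(r"[\w\s]+")

text = "Hi.Привет"

# match() is anchored at the start of the string and stops at the
# first character outside the class, which here is the ".".
first_run = pattern.match(text).group()

# findall() keeps scanning and returns every run it finds.
all_runs = pattern.findall(text)

print(first_run)  # Hi
print(all_runs)   # ['Hi', 'Привет']
```

If you want one string back rather than a list, "".join(all_runs) would glue the runs together.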
ma
On Sep 5, 12:28 pm, phasma <[EMAIL PROTECTED]> wrote:
> string = u"Привет"
All the characters are letters.
> (u'\u041f\u0440\u0438\u0432\u0435\u0442',)
>
> string = u"Hi.Привет"
The third character isn't a letter and isn't whitespace.
> (u'Hi',)
>
> On Sep 4, 9:53 pm, Fredrik Lundh <[EMAIL PRO
string = u"Привет"
(u'\u041f\u0440\u0438\u0432\u0435\u0442',)
string = u"Hi.Привет"
(u'Hi',)
On Sep 4, 9:53 pm, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> phasma wrote:
> > Hi, I'm trying to extract all alphabetic characters from a string.
>
> > reg = re.compile('(?u)([\w\s]+)', re.UNICODE)
> > buf =
phasma wrote:
Hi, I'm trying to extract all alphabetic characters from a string.
reg = re.compile('(?u)([\w\s]+)', re.UNICODE)
buf = reg.match(string)
But it doesn't work. If the string starts with a Cyrillic character,
everything works fine. But if the string starts with a Latin character,
match returns only Latin
On Sep 4, 3:42 pm, phasma <[EMAIL PROTECTED]> wrote:
> Hi, I'm trying to extract all alphabetic characters from a string.
>
> reg = re.compile('(?u)([\w\s]+)', re.UNICODE)
You don't need both (?u) and re.UNICODE: they mean the same thing.
This will actually match word characters (letters, digits, and
underscore) plus whitespace, not just letters.
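
A rough sketch of both points, assuming Python 3 (the thread's code is Python 2, where the flag matters more). The variable names are hypothetical, and the [^\W\d_] class is a common workaround for "letters only" that I am adding here, not something from the original post.

```python
import re

# The inline (?u) flag and the re.UNICODE argument ask for the same thing,
# so any one of these spellings is enough.
inline = re.compile(r"(?u)[\w\s]+")
flagged = re.compile(r"[\w\s]+", re.UNICODE)
both = re.compile(r"(?u)[\w\s]+", re.UNICODE)

sample = "Привет мир"
results = [p.match(sample).group() for p in (inline, flagged, both)]

# "Word character, but not a digit and not an underscore" leaves letters only.
letters_only = re.compile(r"[^\W\d_]+")
letter_runs = letters_only.findall("Hi_5.Привет")

print(results)      # three identical strings
print(letter_runs)  # ['Hi', 'Привет']
```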
> buf = re.ma
Hi, I'm trying to extract all alphabetic characters from a string.
reg = re.compile('(?u)([\w\s]+)', re.UNICODE)
buf = reg.match(string)
But it doesn't work. If the string starts with a Cyrillic character,
everything works fine. But if the string starts with a Latin character,
match returns only Latin characters.
Plea