Re: Identifying unicode punctuation characters with Python regex

2008-11-19 Thread jhermann
> >>> P=P.replace('\\','').replace(']','\\]')   # escape both of them. re.escape() does this w/o any assumptions by your code about the regex implementation. -- http://mail.python.org/mailman/listinfo/python-list

Re: Identifying unicode punctuation characters with Python regex

2008-11-14 Thread Shiao
On Nov 14, 12:30 pm, "Mark Tolonen" <[EMAIL PROTECTED]> wrote: > "Mark Tolonen" <[EMAIL PROTECTED]> wrote in message > > news:[EMAIL PROTECTED] > > > > > > > "Shiao" <[EMAIL PROTECTED]> wrote in message > >news:[EMAIL PROTECTED] > >> Hello, > >> I'm trying to build a regex in python to identify pun

Re: Identifying unicode punctuation characters with Python regex

2008-11-14 Thread Mark Tolonen
"Mark Tolonen" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] "Shiao" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] Hello, I'm trying to build a regex in python to identify punctuation characters in all the languages. Some regex implementations support an extended

Re: Identifying unicode punctuation characters with Python regex

2008-11-14 Thread Mark Tolonen
"Shiao" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] Hello, I'm trying to build a regex in python to identify punctuation characters in all the languages. Some regex implementations support an extended syntax \p{P} that does just that. As far as I know, python re doesn't. Any ide

Re: Identifying unicode punctuation characters with Python regex

2008-11-14 Thread Shiao
On Nov 14, 11:27 am, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > > I'm trying to build a regex in python to identify punctuation > > characters in all the languages. Some regex implementations support an > > extended syntax \p{P} that does just that. As far as I know, python re > > doesn't. Any

Re: Identifying unicode punctuation characters with Python regex

2008-11-14 Thread Martin v. Löwis
> I'm trying to build a regex in python to identify punctuation > characters in all the languages. Some regex implementations support an > extended syntax \p{P} that does just that. As far as I know, python re > doesn't. Any idea of a possible alternative? You should use character classes. You can

Identifying unicode punctuation characters with Python regex

2008-11-14 Thread Shiao
Hello, I'm trying to build a regex in python to identify punctuation characters in all the languages. Some regex implementations support an extended syntax \p{P} that does just that. As far as I know, python re doesn't. Any idea of a possible alternative? Apart from manually including the punctuat