Re: Need a Regular expression to remove a char for Unicode text

Leo Kislov Fri, 13 Oct 2006 05:15:58 -0700

On Oct 13, 4:55 am, "Leo Kislov" <[EMAIL PROTECTED]> wrote:
> On Oct 13, 4:44 am, [EMAIL PROTECTED] wrote:
>
> > శ్రీనివాస wrote:
> > > Hai friends,
> > > Can anyone tell me how I can remove a character from a Unicode text?
> > > కల్‌&హార is a Telugu word in Unicode. Here i want to
> > > remove '&' but not replace it with a zero-width char. One more thing:
> > > if there is whitespace before and after the '&' char, the text should
> > > be kept as it is. Please tell me how I can work this out with regular
> > > expressions.
>
> > > Thanks and regards
> > > Srinivasa Raju Datla
> >
> > Don't know anything about Telugu, but is this the approach you want?
>
> > >>> import re
> > >>> x = u'\xfe\xff & \xfe\xff \xfe\xff&\xfe\xff'
> > >>> noampre = re.compile(r'(?<!\s)&(?!\s)', re.UNICODE).sub
> > >>> noampre('', x)


> He wants to replace & with zero width joiner so the last call should be
> noampre(u"\u200D", x)

Pardon my poor reading comprehension: the OP doesn't want a zero width
joiner, though I'm confused why he mentioned it at all.
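
For the archive, here is a minimal sketch of the plain-removal variant, using
the same pattern quoted above. The function name strip_joined_ampersand is
made up for illustration, and the sample string is the one from the earlier
reply, not real Telugu text. It assumes "kept as it is" means an '&' with
whitespace on either side should stay untouched.

```python
import re

# Match '&' only when the character before it and the character after it
# are both non-whitespace (or the '&' sits at the start/end of the string).
_AMP = re.compile(r'(?<!\s)&(?!\s)', re.UNICODE)


def strip_joined_ampersand(text):
    """Delete every '&' that has no whitespace directly before or after it.

    Hypothetical helper name; nothing is inserted in place of the '&'.
    """
    return _AMP.sub(u'', text)


sample = u'\xfe\xff & \xfe\xff \xfe\xff&\xfe\xff'
result = strip_joined_ampersand(sample)
# The spaced ' & ' survives; the '&' glued between characters is dropped.
print(repr(result))
```

Note that an '&' with whitespace on only one side (for example u'a &b') is
also left alone by this pattern. If those should be removed too, the two
lookarounds would need to be combined differently.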

-- 
http://mail.python.org/mailman/listinfo/python-list
