On Oct 13, 4:55 am, "Leo Kislov" <[EMAIL PROTECTED]> wrote:
> On Oct 13, 4:44 am, [EMAIL PROTECTED] wrote:
>
> > శ్రీనివాస wrote:
> > > Hai friends,
> > > Can any one tell me how can i remove a character from a unocode text.
> > > కల్&హార is a Telugu word in Unicode. Here i want to
> > > remove '&' but not replace with a zero width char. And one more thing,
> > > if any whitespaces are there before and after '&' char, the text should
> > > be kept as it is. Please tell me how can i workout this with regular
> > > expressions.
> > >
> > > Thanks and regards
> > > Srinivasa Raju Datla
> >
> > Don't know anything about Telugu, but is this the
> > approach you want?
> >
> > >>> x=u'\xfe\xff & \xfe\xff \xfe\xff&\xfe\xff'
> > >>> noampre = re.compile('(?<!\s)&(?!\s)', re.UNICODE).sub
> > >>> noampre('', x)
>
> He wants to replace & with zero width joiner so the last call should be
> noampre(u"\u200D", x)

Pardon my poor reading comprehension: the OP doesn't want a zero width joiner, so the original noampre('', x) call was right. Though I'm confused why he mentioned it at all.
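
For anyone reading this later, here is a minimal Python 3 sketch of the same approach, applied to the OP's Telugu word. The names AMP_NO_SPACE and strip_joined_ampersands are made up for illustration and are not from the thread. It also assumes the same reading of the request as the regex quoted above: an '&' is kept if there is whitespace on either side of it, not only when there is whitespace on both sides. In Python 3, str patterns are Unicode-aware by default, so re.UNICODE is not needed.

```python
import re

# '&' with no whitespace immediately before it and none immediately after it.
# Assumption: whitespace on EITHER side is enough to leave the '&' alone,
# which is how the regex quoted in the thread behaves.
AMP_NO_SPACE = re.compile(r'(?<!\s)&(?!\s)')


def strip_joined_ampersands(text):
    """Delete every '&' that has no whitespace directly next to it."""
    return AMP_NO_SPACE.sub('', text)


# The OP's word, written with escapes so the code points are unambiguous:
# KA, LA, VIRAMA, '&', HA, VOWEL SIGN AA, RA
word = '\u0c15\u0c32\u0c4d&\u0c39\u0c3e\u0c30'

cleaned = strip_joined_ampersands(word)    # '&' removed, nothing inserted
spaced = strip_joined_ampersands('a & b')  # left unchanged
```

If the '&' should survive only when whitespace sits on both sides, the pattern would need to change; the thread does not settle which reading the OP meant.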