Re: Clean "Durty" strings

2007-04-02 Thread rzed
"Diez B. Roggisch" <[EMAIL PROTECTED]> wrote in news:[EMAIL PROTECTED]: >> If the OP is constrained to standard libraries, then it may be a question of defining what should be done more clearly. The extraneous spaces can be removed by tokenizing the string and rejoining the tokens. R

Re: Clean "Durty" strings

2007-04-02 Thread irstas
On Apr 2, 10:08 pm, Michael Hoffman <[EMAIL PROTECTED]> wrote: > [EMAIL PROTECTED] wrote: > > But it could be that he just wants all HTML tags to disappear, like in his example. Code like this might be sufficient then: re.sub(r'<[^>]+>', '', s). > > Won't work for, say, this: > > > -- >

Re: Clean "Durty" strings

2007-04-02 Thread Michael Hoffman
[EMAIL PROTECTED] wrote: > But it could be that he just wants all HTML tags to disappear, like in his example. Code like this might be sufficient then: re.sub(r'<[^>]+>', '', s). Won't work for, say, this: -- Michael Hoffman -- http://mail.python.org/mailman/listinfo/python-list
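For reference, a minimal sketch of the regex approach quoted above and one way it can go wrong. The example Michael Hoffman posted did not survive in this archive, so the `tricky` input below is an assumed illustration (a `>` inside an attribute value), not his original case. The names `TAG_RE` and `strip_tags` are made up for this sketch.

```python
import re

# The pattern from the thread: "<", then one or more non-">" characters, then ">".
TAG_RE = re.compile(r'<[^>]+>')

def strip_tags(s):
    """Remove anything that looks like an HTML tag (naive, regex-only)."""
    return TAG_RE.sub('', s)

# Simple markup is handled as expected.
print(strip_tags('<p>bonne <b>mentalité</b></p>'))  # bonne mentalité

# Assumed failure case: the first ">" sits inside a quoted attribute,
# so the match ends too early and part of the tag leaks into the output.
tricky = '<img alt="a > b" src="x.png">text'
print(repr(strip_tags(tricky)))  # ' b" src="x.png">text'
```

So the regex is probably fine for trusted, simple markup, but a real parser is the safer choice for arbitrary HTML.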

Re: Clean "Durty" strings

2007-04-02 Thread Marc 'BlackJack' Rintsch
In <[EMAIL PROTECTED]>, irstas wrote: > I'd like to see how this transformation can be done with > BeautifulSoup. Well, the last two regexps can be replaced with this: > > unicode(BeautifulStoneSoup(s,convertEntities=BeautifulStoneSoup.HTML_ENTITIES).contents[0]) Completely without regular expre
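The BeautifulStoneSoup call above comes from the BeautifulSoup 3 / Python 2 era. As a rough present-day equivalent for the entity-conversion step only, here is a sketch that swaps in the standard library's `html.unescape` (Python 3.4+). This is a substitute technique, not what the poster used, and `decode_entities` is a hypothetical helper name.

```python
from html import unescape

def decode_entities(s):
    """Turn HTML character references such as &eacute; or &amp; into text."""
    return unescape(s)

# Assumed sample input, not taken from the thread.
print(decode_entities("caf&eacute; &amp; th&eacute; &lt;3"))  # café & thé <3
```

Note that this only decodes entities. It does not remove tags, so it would be combined with a parser or some other tag-stripping step.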

Re: Clean "Durty" strings

2007-04-02 Thread irstas
On Apr 2, 4:05 pm, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: > > If the OP is constrained to standard libraries, then it may be a question of defining what should be done more clearly. The extraneous spaces can be removed by tokenizing the string and rejoining the tokens. Replacing

Re: Clean "Durty" strings

2007-04-02 Thread Diez B. Roggisch
> > If the OP is constrained to standard libraries, then it may be a question of defining what should be done more clearly. The extraneous spaces can be removed by tokenizing the string and rejoining the tokens. Replacing portions of a string with equivalents is standard stuff. It might be
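The "tokenize and rejoin" idea in the quoted text can be sketched with nothing but built-in string methods. This is one plausible reading of the suggestion, not code from the thread; `collapse_whitespace` and the sample string are assumptions for illustration.

```python
def collapse_whitespace(text):
    """Split on any run of whitespace, then rejoin with single spaces."""
    # str.split() with no argument splits on spaces, tabs and newlines
    # and drops empty tokens, so leading/trailing whitespace goes too.
    return " ".join(text.split())

sample = "bonne   mentalité \n mec!:)   \t bon pour info  "
print(collapse_whitespace(sample))  # bonne mentalité mec!:) bon pour info
```

Because `split()` treats newlines as whitespace, this also flattens the text onto one line, which may or may not be what the OP wants.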

Re: Clean "Durty" strings

2007-04-02 Thread rzed
"Diez B. Roggisch" <[EMAIL PROTECTED]> wrote in news:[EMAIL PROTECTED]: > Ulysse wrote: > >> Hello, >> >> I need to clean the string like this : >> >> string = >> """ >> bonne mentalité mec!:) \nbon >> pour info moi je suis un serial posteur arceleur dictateur ^^* >> \n

Re: Clean "Durty" strings

2007-04-02 Thread Diez B. Roggisch
Ulysse wrote: > Hello, > > I need to clean the string like this : > > string = > """ > bonne mentalité mec!:) \nbon pour > info moi je suis un serial posteur arceleur dictateur ^^* > \nmais pour avoir des resultats probant il > faut pas faire les m

Clean "Durty" strings

2007-04-01 Thread Ulysse
Hello, I need to clean a string like this: string = """ bonne mentalité mec!:) \nbon pour info moi je suis un serial posteur arceleur dictateur ^^* \nmais pour avoir des resultats probant il faut pas faire les mariolles, comme le "fondateur" de b