一首诗 wrote:
> Oh, I didn't make myself clear.
>
> What I mean is how to convert a piece of HTML to plain text but keep as
> much formatting as possible.
>
> For example, convert "&nbsp;" to a blank space and <br> to "\r\n".
>
> Gary Herron wrote:
>> 一首诗 wrote:
>>> Is there any simple way to solve this problem?
>>
>> Yes, strings have a replace method:
>>
>> >>> s = "abc&nbsp;def"
>> >>> s.replace('&nbsp;', ' ')
>> 'abc def'
>>
>> Also, various modules that are meant to deal with web and XML and such
>> have functions to do such operations.
>>
>> Gary Herron

>>> my_translations = '''
   "&nbsp;= "
   # "<br>=\r\n"  "<BR>=\r\n"   # Windows
   "<br>=\n"  "<BR>=\n"         # Linux
   # Add others to your heart's content
'''
>>> import SE   # From http://cheeseshop.python.org/pypi/SE/2.2%20beta
>>> My_Translator = SE.SE (my_translations)
>>> print My_Translator ('ABC&nbsp;DEFG<br>XYZ')
ABC DEFG
XYZ

SE can also strip tags and translate all HTML escapes, and generally lets you do ad hoc translations in seconds. You just write them up, make an SE object from your text, and run your data through it. As simple as that.

If you wish further explanations, I'll be happy to explain.

Frederic
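
For readers who do not have SE installed, the same idea (turn <br> into a line break, drop other tags, decode HTML escapes) can be roughly approximated with the standard library alone. This is only a sketch under assumptions: it targets Python 3, the name html_to_text and its newline parameter are made up for illustration and are not part of SE, and the regular expressions handle simple fragments like the one above, not arbitrary HTML.

```python
import html
import re

# Hypothetical helper, not SE's API: a stdlib-only approximation.
_BR = re.compile(r"<br\s*/?>", re.IGNORECASE)   # <br>, <BR>, <br/>, <br />
_TAG = re.compile(r"<[^>]+>")                    # any other simple tag


def html_to_text(fragment, newline="\n"):
    """Convert a small HTML fragment to plain text, keeping line breaks."""
    text = _BR.sub(newline, fragment)   # line-break tags become newlines
    text = _TAG.sub("", text)           # remaining tags are dropped
    text = html.unescape(text)          # &nbsp; -> U+00A0, &amp; -> &, ...
    return text.replace("\xa0", " ")    # non-breaking space -> plain space


print(html_to_text("ABC&nbsp;DEFG<br>XYZ"))
```

Pass newline="\r\n" for Windows-style line endings. Escapes are decoded only after the tags are stripped, so an escaped "&lt;b&gt;" survives as the literal text "<b>" instead of being removed as a tag.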