I have to upgrade a website with a lot of new content, which I received as OpenOffice documents. If I open one of those documents and save it as HTML, I can cut and paste the result into the HTML page. That is not bad as a start, but I need to clean up that HTML (remove tags, remove or change attributes, ...). My first idea is to use lxml for that.

My questions:
- Is there a better way?
- Is lxml the right tool for that?
- Does anyone have some example code for doing that?
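To make the kind of cleanup I mean concrete, here is a minimal sketch with lxml of what I imagine it could look like. The function name clean_fragment, the tag lists and the attribute whitelist are only my assumptions for illustration, not a recommended policy. The real OpenOffice export will probably need different lists.

```python
import lxml.html
from lxml import etree

# Assumed cleanup policy (hypothetical, adjust to the real export):
DROP_WITH_CONTENT = ("style", "script", "meta")   # removed together with their content
UNWRAP = ("font", "span")                          # tag removed, text and children kept
KEEP_ATTRS = {"href", "src", "alt", "title"}       # every other attribute is deleted


def clean_fragment(html):
    """Return a cleaned copy of an HTML fragment as a string."""
    # Wrap the fragment in a <div> so several top-level elements are accepted.
    root = lxml.html.fragment_fromstring(html, create_parent="div")

    # Drop unwanted elements entirely, but keep the text that follows them.
    etree.strip_elements(root, *DROP_WITH_CONTENT, with_tail=False)

    # Remove presentational wrappers while keeping what they contain.
    etree.strip_tags(root, *UNWRAP)

    # Delete every attribute that is not on the whitelist.
    for el in root.iter():
        if not isinstance(el.tag, str):
            continue  # skip comments and processing instructions
        for name in list(el.attrib):
            if name not in KEEP_ATTRS:
                del el.attrib[name]

    # Serialize the content of the wrapper <div>, not the wrapper itself.
    parts = [root.text or ""]
    parts.extend(lxml.html.tostring(child, encoding="unicode") for child in root)
    return "".join(parts)


if __name__ == "__main__":
    sample = (
        '<p class="western" style="margin-bottom: 0cm">'
        '<font face="Arial"><span lang="en-US">Hello</span></font> '
        '<a href="x.html" class="c">link</a></p>'
    )
    print(clean_fragment(sample))
```

Changing an attribute instead of removing it would be a matter of assigning to el.attrib[name] inside the same loop.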
Have a nice day.

--
http://mail.python.org/mailman/listinfo/python-list