> Is there some reason you really want to convert to PDF first? You can
> get much better HTML right from the Word doc. You'll lose a lot of info
> going from PDF to HTML.
Right now, two reasons: printing to PDF lets me create the PDF "for
the web", which means it has a much smaller file size n
I need to take a bunch of .doc files (Word 2000) which have a little text,
including some tables/layout, and mostly pictures, and convert them to PDF and
extract the text and images separately too. If I have a PDF, I can create
the HTML with pdftohtml called from Python with popen. However I n
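
For reference, the pdftohtml-from-Python step could look roughly like the sketch below. It assumes pdftohtml (from xpdf/poppler) is installed and on the PATH; the helper names `build_pdftohtml_command` and `convert_pdf_to_html` are made up for illustration, and `-noframes` is just one option I would probably pick, not a requirement.

```python
import shutil
import subprocess


def build_pdftohtml_command(pdf_path, output_base):
    """Return the argument list for a pdftohtml call (no shell involved)."""
    # -noframes asks pdftohtml for a single HTML file instead of a frameset.
    # Assumption: this option suits the job; drop it if frames are wanted.
    return ["pdftohtml", "-noframes", pdf_path, output_base]


def convert_pdf_to_html(pdf_path, output_base):
    """Run pdftohtml via Popen and return whatever it printed to stdout."""
    if shutil.which("pdftohtml") is None:
        raise RuntimeError("pdftohtml not found on PATH")
    proc = subprocess.Popen(
        build_pdftohtml_command(pdf_path, output_base),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    out, err = proc.communicate()  # wait for the tool to finish
    if proc.returncode != 0:
        raise RuntimeError("pdftohtml failed: " + err.strip())
    return out
```

Calling `convert_pdf_to_html("report.pdf", "report")` (hypothetical file names) would then leave the HTML and any extracted images next to the output base name. Passing the arguments as a list avoids quoting problems with file names that contain spaces.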