Re: Html or Pdf to Rtf (Linux) with Python

Mike Meyer Thu, 16 Dec 2004 10:05:08 -0800

Axel Straschil <[EMAIL PROTECTED]> writes:

> Hello!
>
>> However, our company's product, PDFTextStream does do a phenomenal
>> job of extracting text and metadata out of PDF documents.  It's
>> crazy-fast, has a clean API, and in general gets the job done very
>> nicely.  It presents two points of compromise from your ideal
>> situation:
>> 1. It only produces text, so you would have to take the text it
>> provides and write it out as an RTF yourself (there are tons of
>> packages and tools that do this).  Since the RTF format has pretty
>> weak formatting capabilities compared
>
> I've got the input source in HTML; the problem is converting from any
> format to RTF. Please give me a hint where the tons of packages are.


That's easy. Load the HTML in MS Word and save it as RTF. Script it
via COM using the Python win32all package (I think that's what it's
now called).
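
Roughly, that could look like the sketch below. Treat it as an
untested outline, not a finished tool: it assumes a Windows machine
with MS Word installed and the win32 extensions available (these days
distributed as pywin32), and the names rtf_path_for, html_to_rtf and
WD_FORMAT_RTF are made up for illustration. The value 6 is Word's
wdFormatRTF constant.

```python
from pathlib import Path

WD_FORMAT_RTF = 6  # numeric value of Word's wdFormatRTF constant


def rtf_path_for(html_path):
    """Return the absolute output path: same name, .rtf extension."""
    return Path(html_path).resolve().with_suffix(".rtf")


def html_to_rtf(html_path):
    """Open html_path in Word via COM and save it as RTF (Windows only)."""
    # Imported lazily so this module still loads where pywin32 is absent.
    import win32com.client

    src = Path(html_path).resolve()  # Word wants absolute paths
    dst = rtf_path_for(src)
    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(str(src))
        try:
            doc.SaveAs(str(dst), FileFormat=WD_FORMAT_RTF)
        finally:
            doc.Close(False)  # False: do not save changes to the source
    finally:
        word.Quit()
    return dst
```

Calling html_to_rtf("page.html") should leave page.rtf next to the
input file. How good the result is depends entirely on how well Word
renders that particular HTML.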

        <mike
-- 
Mike Meyer <[EMAIL PROTECTED]>                  http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
-- 
http://mail.python.org/mailman/listinfo/python-list
