Quoth Chad Perrin on Monday, 06 September 2010:
> On Sun, Sep 05, 2010 at 10:31:54AM +0200, Erik Trulsson wrote:
> > On Sun, Sep 05, 2010 at 08:57:11AM +0200, Roland Smith wrote:
> > > On Sat, Sep 04, 2010 at 05:09:20PM -0600, Chad Perrin wrote:
> > > > What PDF to HTML translators, other than pdftohtml, am I likely to be
> > > > able to find in ports?  I went looking for pdf2html, expecting to find
> > > > that there, but no luck.  Before I spend hours sifting through, still
> > > > without knowing whether I missed something that should be obvious, 
> > > 
> > > Yes, you did. :-)
> 
> Apparently not.  See below.
> 
> 
> > > 
> > > > I
> > > > figured I'd ask here whether anyone knows of something off the top of
> > > > his/her head.
> > > 
> > > Try textproc/pdftohtml 
> > 
> > Uhm, he said "other than pdftohtml" so I suspect he already knew about
> > that one.
> 
> This is indeed the case.
> 
> I appreciate the several suggestions I've received, though I see in
> retrospect that I haven't been sufficiently specific, since I have not
> gotten any suitable answers.
> 
> I have "inherited" a Perl script that wraps pdftohtml.  The reason a
> wrapper is needed is that a substantial amount of cleanup work is needed
> to produce HTML suitable to our final needs.  The output of pdftohtml is
> sufficiently far from "perfect" that I would like to test the output of a
> few other possible "back ends" for the script to see if a significant
> amount of work being done by the script can be eliminated.
> 
> Toward that end, the simpler the tool the better -- and the tool on the
> "back end" should not be something that must be contacted across a
> network, or that cannot be redistributed freely.  I wanted to start with
> things I have in the base system on my FreeBSD laptop (where I'm doing my
> development) or through ports.  OpenOffice.org is quite a bit larger and
> more unwieldy than we would really want to deal with at this point.
> Using Google or Adobe tools online is well outside the range of what we
> need (requiring network access for the tool to work).
> 
> I've started looking at the Xpdf tools as well as pdftohtml.  Other
> suggestions from within ports would be appreciated.  Additional options
> other than what can be found in ports might also be useful, understanding
> the needs I sketched out above.  The script itself is Perl, in case that
> matters.
> 
> To everyone who has replied so far: thank you for your time.
> 
> -- 
> Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

How about print/p5-PDFLib and print/pecl-pdflib to roll your own?  Maybe
that's more work than you wanted.
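
If rolling your own is too much, a cheaper first step might be a small
harness that runs each candidate back end on the same PDF and lets you
compare the output before touching the cleanup code. The following is
only a sketch, written in Python for brevity (the same shape ports to
Perl). The BACKENDS table, the function names, and the command-line
flags shown are all assumptions on my part; check them against the man
pages of the versions you have installed.

```python
import shutil
import subprocess

# Hypothetical back-end table: name -> argv template. "{pdf}" is replaced
# by the input path. The flags are assumptions; verify them locally.
BACKENDS = {
    "pdftohtml": ["pdftohtml", "-stdout", "-noframes", "-i", "{pdf}"],
    "pdftotext": ["pdftotext", "-htmlmeta", "{pdf}", "-"],
}


def build_command(template, pdf_path):
    """Substitute the PDF path into an argv template."""
    return [pdf_path if arg == "{pdf}" else arg for arg in template]


def convert(name, pdf_path, backends=BACKENDS):
    """Run one back end; return its output text, or None if it is
    not installed or exits with a non-zero status."""
    argv = build_command(backends[name], pdf_path)
    if shutil.which(argv[0]) is None:
        return None  # tool not found on PATH
    result = subprocess.run(
        argv, capture_output=True, text=True, errors="replace"
    )
    return result.stdout if result.returncode == 0 else None


def compare(pdf_path, backends=BACKENDS):
    """Map each back-end name to the length of its output (None on failure)."""
    sizes = {}
    for name in backends:
        html = convert(name, pdf_path, backends)
        sizes[name] = None if html is None else len(html)
    return sizes
```

Calling compare("sample.pdf") gives only a rough first look (output
size per back end); the real test is feeding each result through the
existing cleanup code and seeing how much of it is still needed.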

-- 
Sterling (Chip) Camden    | sterl...@camdensoftware.com | 2048D/3A978E4F
http://camdensoftware.com | http://chipstips.com        | http://chipsquips.com
