I've done some work on a PDF-to-HTML converter.  It's not really
finished yet, and I haven't made any decisions on how to release it, but
it will include hyperlinks.

- Derek


On 2009 Jun 30, jida...@jidanni.org wrote:
> Package: poppler-utils
> Version: 0.10.6-1
> Severity: wishlist
> File: /usr/bin/pdftotext
> X-debbugs-Cc: der...@foolabs.com
> 
> pdftotext has no option to copy link locations from the document into
> the output stream.
> 
> E.g., here are 50 or so email addresses that one has to dig straight out
> of the PDF, as one never would get to see them in any pdftotext output:
> 
> $ GET http://portal.gsdi.org/files/?artifact_id=448 |
> perl -nwle 'print for /\...@\w+\.[\w.]+/g'|wc -l
> 51
> 
> Those were the "/Subtype/Link/A<</URI(mailto:" mailto links. I bet
> pdftotext does no better for http links.
> 
> How to print them, e.g.,
> Nordsburg[http://nordsburg.net/]
> Nordsburg(http://nordsburg.net/)
> Nordsburg (http://nordsburg.net/)
> etc. is up to you.
> 
> 
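
For illustration only, here is a minimal Python sketch of the "dig straight
out of the PDF" approach described in the report. It is not how pdftotext or
the converter mentioned above works. It assumes the link annotations sit in
uncompressed objects and that each /URI value is a literal string in
parentheses; links inside compressed object streams, or written as hex
strings, will be missed. The function name extract_uris is hypothetical.

```python
import re

# Matches "/URI (...)" where the value is a PDF literal string.
# Escaped characters such as \( and \) are allowed inside the string;
# unescaped nested parentheses are not handled by this sketch.
URI_PATTERN = re.compile(rb'/URI\s*\(((?:\\.|[^\\()])*)\)')


def extract_uris(data: bytes) -> list[str]:
    """Return the URI targets found in raw, uncompressed PDF bytes."""
    uris = []
    for match in URI_PATTERN.finditer(data):
        raw = match.group(1)
        # Undo the backslash escapes for parentheses and backslashes.
        raw = re.sub(rb'\\([()\\])', rb'\1', raw)
        uris.append(raw.decode('latin-1'))
    return uris


if __name__ == '__main__':
    import sys

    # Usage: python extract_uris.py file.pdf
    with open(sys.argv[1], 'rb') as handle:
        for uri in extract_uris(handle.read()):
            print(uri)
```

This only lists the link targets. Pairing each target with the visible text
it covers, as in the "Nordsburg (http://nordsburg.net/)" suggestion, would
need the annotation rectangles and the page layout, which this sketch does
not attempt.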



