I've done some work on a PDF-to-HTML converter. It's not really finished yet, and I haven't made any decisions on how to release it, but it will include hyperlinks.
- Derek

On 2009 Jun 30, jida...@jidanni.org wrote:
> Package: poppler-utils
> Version: 0.10.6-1
> Severity: wishlist
> File: /usr/bin/pdftotext
> X-debbugs-Cc: der...@foolabs.com
>
> pdftotext has no option to copy link locations from the document into
> the output stream.
>
> E.g, here are 50 or so email addresses that one has to dig straight out
> of the PDF, as one never would get to see them in any pdftotext output,
>
> $ GET http://portal.gsdi.org/files/?artifact_id=448 |
> perl -nwle 'print for /\...@\w+\.[\w.]+/g'|wc -l
> 51
>
> Those were the "/Subtype/Link/A<</URI(mailto:" mailto links. I bet
> pdftotext does no better for http links.
>
> How to print them, e.g.,
> Nordsburg[http://nordsburg.net/]
> Nordsburg(http://nordsburg.net/)
> Nordsburg (http://nordsburg.net/)
> etc. is up to you.
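
For reference, the workaround described in the report (digging the link targets straight out of the PDF bytes) could be sketched roughly as follows. This is only an illustration, not how pdftotext or the converter mentioned above works: the function name `extract_link_uris` and the `SAMPLE` bytes are made up for the example, and the approach assumes the link annotations are stored uncompressed with the URI as a literal `(...)` string. Links inside compressed object streams, or URIs written as hex strings, would be missed.

```python
import re

# Matches "/URI" followed by a PDF literal string "(...)".
# Backslash escapes are allowed; nested unescaped parentheses are not handled.
_URI_RE = re.compile(rb"/URI\s*\(((?:\\.|[^\\()])*)\)")


def extract_link_uris(pdf_bytes: bytes) -> list[str]:
    """Return the URI action targets found in raw, uncompressed PDF bytes."""
    uris = []
    for match in _URI_RE.finditer(pdf_bytes):
        raw = match.group(1)
        # Undo the escapes for parentheses and backslashes only.
        raw = re.sub(rb"\\([()\\])", rb"\1", raw)
        uris.append(raw.decode("latin-1"))
    return uris


# Hypothetical annotation fragments in the shape quoted in the report.
SAMPLE = (
    b"<</Type/Annot/Subtype/Link/A<</S/URI/URI(mailto:someone@example.org)>>>>\n"
    b"<</Type/Annot/Subtype/Link/A<</S/URI/URI(http://nordsburg.net/)>>>>\n"
)

if __name__ == "__main__":
    for uri in extract_link_uris(SAMPLE):
        print(uri)
```

Note that this recovers only the link targets, not the text they are attached to. Printing something like "Nordsburg (http://nordsburg.net/)" would additionally require matching each annotation's rectangle against the page text, which a byte-level scan cannot do.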