Hello. If there are any developers reading this, I'd like to make a
friendly suggestion for htdig for Etch. I think it would be a good idea
to include a note about external parsers in the htdig.conf file (which
did exist in the Sarge version of htdig).
I spent a good few hours messing around with the doc2html.pl file
(which I found in the examples section of the included htdig
documentation). Someone on the htdig mailing list had also suggested
this file. No matter what I did, I could not get it to work. I then
chanced upon the parse_doc.pl file, and got parsing to work by adding
the following to htdig.conf:
external_parsers: application/pdf->text/html /usr/share/htdig/parse_doc.pl \
                  application/msword->text/html /usr/share/htdig/parse_doc.pl
It would be nice if this were already included in the htdig.conf file,
perhaps commented out, giving users the choice to activate it, along
with a little note about installing xpdf-utils and/or acroread, and
catdoc, to make it work. That way, others can avoid losing precious
time setting up their search engine to parse PDF documents.
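Something along these lines is what I have in mind. This is only a
sketch: the comment wording is my own, the parser lines are the ones
that worked for me, and the package names are the ones mentioned
above.

# To index PDF and MS Word documents, install xpdf-utils and/or
# acroread (for PDF) and catdoc (for Word), then uncomment the
# following two lines:
#external_parsers: application/pdf->text/html /usr/share/htdig/parse_doc.pl \
#                  application/msword->text/html /usr/share/htdig/parse_doc.pl
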
It would also be a good idea to have the accompanying documentation
reflect the usage of the parse_doc.pl file, instead of providing
examples of stuff that clearly does not work.
Thanks for the great work on Debian.
Mark