Hello, I'm trying to set up a document server with htdig on SuSE 8.2. As part of this I also tried to index my PDF files, some of them created with LyX, using htdig's external_parsers method. This works for all my PDF files except the ones generated from LyX sources.
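In case it helps anyone reproduce the symptom outside of htdig, here is a rough sketch of a check that could be run directly on the files. It assumes pdftotext is on the PATH; the helper names (letter_ratio, looks_indexable, extract_text) and the 0.5 threshold are made up for illustration and are not part of htdig or doc2html.

```python
import subprocess
import sys


def letter_ratio(text):
    """Fraction of non-whitespace characters that are letters or digits."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return 0.0
    return sum(ch.isalnum() for ch in visible) / len(visible)


def looks_indexable(text, threshold=0.5):
    """Heuristic only: the threshold is an arbitrary, illustrative value."""
    return letter_ratio(text) >= threshold


def extract_text(pdf_path):
    """Run pdftotext and return its output; "-" sends the text to stdout."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Usage: python check_pdf.py file1.pdf file2.pdf ...
    for path in sys.argv[1:]:
        text = extract_text(path)
        verdict = "ok" if looks_indexable(text) else "no usable text"
        print(path, verdict)
```

A file that comes back empty or mostly non-alphanumeric here would fail in the same way when htdig calls the parser chain, so this narrows the problem down to pdftotext itself.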
After a long time debugging through the following chain of programs:

genhtdig.pl -> htdig -> doc2html.pl -> pdf2html.pl -> pdftotext

I finally found this in the man page of pdftotext:

BUGS
Some PDF files contain fonts whose encodings have been mangled beyond recognition. There is no way (short of OCR) to extract text from these files.

Question: is that really the reason why pdftotext fails on PDF files produced from LyX? And if so, is there another way to index such files?

Thank you,
bernhard

-- 
http://home.t-online.de/home/mb.schiekel/
GPG-Key available: GnuPG-1.2.2