[ 
http://jira.dspace.org/jira/browse/DS-183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11255#action_11255
 ] 

keith johnson commented on DS-183:
----------------------------------

Installing per xpdf-filters.html works for certain formats but not others.
Conversion from PDF to HTML can work, but certain editing abilities are lost.
see http://www.uk-mobile-phone.com
The thumbnail image does not load properly in such 3D documents and renders
a red cross rather than the desired imagery.

> XPDF support for filtering PDFs for text extraction/search.
> -----------------------------------------------------------
>
>                 Key: DS-183
>                 URL: http://jira.dspace.org/jira/browse/DS-183
>             Project: DSpace 1.x
>          Issue Type: Improvement
>          Components: DSpace API
>    Affects Versions: 1.5.1, 1.5.2
>         Environment: Unix and Linux
>            Reporter: Mark Diggory
>            Assignee: Mark Diggory
>             Fix For: 1.5.2
>
>         Attachments: xpdf-filters.html, xpdf-filters.xml, XPDFFilters.patch
>
>
> See original description here...
> https://sourceforge.net/tracker/?func=detail&aid=2745393&group_id=19984&atid=319984
> Here are a pair of mediafilters to process PDF files with the
> XPDF suite (see http://www.foolabs.com/xpdf/ ) replacing the
> one based on PDFBox. They invoke an external command, which
> must be configured. It has been tested on Unix and the concept
> ought to work on Windows (and certainly on MacOS X).
> XPDF2Text is a replacement for the existing PDF media filter; it
> creates extracted text using the pdftotext program. I've observed it
> is about 3 times as fast, and much more reliable, than PDFBox.
> XPDF2Thumbnail creates a thumbnail image for the first page of
> the PDF. This is especially effective for 3D PDF renderings of
> engineering models, but works fine for any document.
> See the instructions in xpdf-filters.html to install it.
> The thumbnail filter needs an additional image library, but
> the text extractor doesn't need anything else.
> This code has been tested with DSpace 1.5.1.
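
For readers unfamiliar with the approach, here is a minimal sketch of what "invoke an external command" to extract text can look like. It is not the code in XPDFFilters.patch: the class name XpdfTextSketch and the helpers buildCommand and extractText are hypothetical, the executable is assumed to be on the PATH as "pdftotext", and it uses Java 9+ APIs that postdate DSpace 1.5.x.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical illustration only; not taken from XPDFFilters.patch.
class XpdfTextSketch {

    // Build the argument list for pdftotext.
    // -q suppresses messages, -enc UTF-8 selects the output encoding,
    // and a trailing "-" asks pdftotext to write the text to stdout.
    static List<String> buildCommand(String executable, Path pdf) {
        return List.of(executable, "-q", "-enc", "UTF-8", pdf.toString(), "-");
    }

    // Run the external command and return everything it wrote to stdout.
    static String extractText(String executable, Path pdf)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(executable, pdf));
        // Discard stderr so a chatty child process cannot block on a full pipe.
        pb.redirectError(ProcessBuilder.Redirect.DISCARD);
        Process process = pb.start();

        String text;
        try (InputStream out = process.getInputStream()) {
            text = new String(out.readAllBytes(), StandardCharsets.UTF_8);
        }

        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException("pdftotext exited with status " + exit);
        }
        return text;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java XpdfTextSketch <file.pdf>");
            return;
        }
        // Assumes "pdftotext" is installed and on the PATH.
        System.out.print(extractText("pdftotext", Paths.get(args[0])));
    }
}
```

In a real filter the executable path would come from configuration, as the description says the external command must be configured; it is hard-coded in main here only to keep the sketch short.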

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://jira.dspace.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
