Ewan Mellor created TIKA-2586:
---------------------------------

             Summary: PDFParser documentation has incorrect DPI default
                 Key: TIKA-2586
                 URL: https://issues.apache.org/jira/browse/TIKA-2586
             Project: Tika
          Issue Type: Improvement
          Components: documentation
            Reporter: Ewan Mellor

On [https://wiki.apache.org/tika/PDFParser%20%28Apache%20PDFBox%29] it says:
{quote}This method of OCR is triggered by the ocrStrategy parameter, but users can manipulate other parameters, including the image type (see org.apache.pdfbox.rendering.ImageType for options) and the dots per inch dpi. The defaults are: gray and 200 respectively.
{quote}
The DPI default stated here is incorrect. In both tika/tika-parsers/src/main/resources/org/apache/tika/parser/pdf/PDFParser.properties and tika/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParserConfig.java, the ocrDPI value is set to 300, not 200.

The wiki page is read-only (at least for me), so I can't correct it myself.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)