[ https://issues.apache.org/jira/browse/TIKA-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058644#comment-14058644 ]
Hudson commented on TIKA-1351:
------------------------------

SUCCESS: Integrated in tika-trunk-jdk1.7 #89 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/89/])
[TIKA-1351] Updating AutoDetect, Composite and PDF parsers to guard against null content handlers (sergeyb: http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1609677)
* /tika/trunk/tika-core/src/main/java/org/apache/tika/parser/AutoDetectParser.java
* /tika/trunk/tika-core/src/main/java/org/apache/tika/parser/CompositeParser.java
* /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
* /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java

> Parser implementations should accept null content handlers
> ----------------------------------------------------------
>
>                 Key: TIKA-1351
>                 URL: https://issues.apache.org/jira/browse/TIKA-1351
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Sergey Beryozkin
>            Priority: Minor
>
> Applications that want to let users search documents based only on their
> metadata do not need the content parsed.
> The only workaround I've found so far is to pass a no-op content handler,
> which can ignore the content events but does not stop a parser such as
> PDFParser from parsing the content.
> Proposal: update the parser API docs to let implementers know that
> ContentHandler can be null, and update the shipped implementations to parse
> only the metadata if ContentHandler is null.
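
For illustration, the null-handler guard proposed in the issue could look roughly like the sketch below. This is not the code from rev 1609677 and does not use Tika's Parser interface: NullHandlerSketch, CountingHandler and the parse(title, body, metadata, handler) signature are hypothetical names made up for this example, and only the org.xml.sax types come from the JDK. The idea is simply that metadata is always collected, while content events are produced only when a handler was supplied.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for a parser; NOT Tika's actual Parser API.
class NullHandlerSketch {

    /** Hypothetical handler that counts the characters it receives. */
    static class CountingHandler extends DefaultHandler {
        int chars = 0;

        @Override
        public void characters(char[] ch, int start, int length) {
            chars += length;
        }
    }

    /**
     * Records metadata unconditionally, and emits content events
     * only when the caller passed a non-null handler.
     */
    static void parse(String title, String body,
                      Map<String, String> metadata,
                      ContentHandler handler) throws SAXException {
        // Metadata extraction happens regardless of the handler.
        metadata.put("title", title);

        // Guard: a null handler means "metadata only", so skip the content.
        if (handler == null) {
            return;
        }

        handler.startDocument();
        char[] text = body.toCharArray();
        handler.characters(text, 0, text.length);
        handler.endDocument();
    }

    public static void main(String[] args) throws SAXException {
        Map<String, String> metadata = new LinkedHashMap<>();

        // Metadata-only call: no handler, so no content events are produced.
        parse("Report", "Hello, world", metadata, null);
        System.out.println("title = " + metadata.get("title"));

        // Full call: the handler receives the body text.
        CountingHandler handler = new CountingHandler();
        parse("Report", "Hello, world", metadata, handler);
        System.out.println("chars = " + handler.chars);
    }
}
```

Compared with the no-op handler workaround described in the issue, the guard lets the parser return before doing any content work at all, instead of parsing the content and discarding the events.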