MimeType.getDescription() often returns nothing when "tika-mimetypes.xml" has a useful description already available. ---------------------------------------------------------------------------------------------------------------------
Key: TIKA-515 URL: https://issues.apache.org/jira/browse/TIKA-515 Project: Tika Issue Type: Bug Affects Versions: 0.7 Environment: ALL Reporter: Miroslav Pokorny The reader that reads the XML and builds MimeTypes seem to be hard coded to the read only a single element when another is often used to hold descriptions/comments.. String COMMENT_TAG = "_comment"; MimeTypesReader.readMimeType(Element element) throws MimeTypeException { if (nodeElement.getTagName().equals(COMMENT_TAG)) { type.setDescription( nodeElement.getFirstChild().getNodeValue()); --xml sample #1-- <mime-type type="application/msword"> <alias type="application/vnd.ms-word"/> <comment>Microsoft Word Document</comment> notice "comment" not "_comment' element... Why not simply rename all "_comment" tags to _comment and update the constant and all will be well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.