[ https://issues.apache.org/jira/browse/TIKA-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15771026#comment-15771026 ]
David Pilato commented on TIKA-2227:
------------------------------------

Sorry. The answer is {{TikaCoreProperties.KEYWORDS}}. Don't know how I missed it... :)

> Replacement of MSOffice#KEYWORDS for RTF and ODT docs
> -----------------------------------------------------
>
>                 Key: TIKA-2227
>                 URL: https://issues.apache.org/jira/browse/TIKA-2227
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.14
>            Reporter: David Pilato
>            Priority: Minor
>
> I'm trying to extract metadata from different types of documents.
> For that I'm using {{metadata.get(MSOffice.KEYWORDS)}}, but it's marked as
> {{Deprecated}} in favor of the {{Office}} class.
> So I changed my code to use {{metadata.get(Office.KEYWORDS)}} instead.
> It does not work for 2 types of docs:
> * RTF: https://github.com/dadoonet/fscrawler/blob/master/src/test/resources/documents/test.rtf
> * ODT: https://github.com/dadoonet/fscrawler/blob/master/src/test/resources/documents/test.odt
> It seems that RTF and ODT keywords are extracted to a {{"Keyword"}} metadata
> name, although they should probably be written to {{"meta:keyword"}}.
> If needed, you can reuse the documents linked above as test cases.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
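
The name mismatch reported in the issue can be illustrated with a small, library-free sketch. It is not Tika code: a plain `Map` stands in for Tika's `Metadata` object, and the class `KeywordLookup`, the method `firstPresent`, and the lookup order are hypothetical names and choices made up for illustration. Only the two metadata names, {{"Keyword"}} and {{"meta:keyword"}}, come from the report. The sketch shows why a lookup by a single name misses the value, and how a lookup that tries several names would find it.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class KeywordLookup {
    // Metadata names quoted in the issue; trying them in this order is an assumption.
    public static final List<String> CANDIDATE_NAMES = List.of("meta:keyword", "Keyword");

    // Returns the first non-empty value stored under any of the given names.
    public static Optional<String> firstPresent(Map<String, String> metadata, List<String> names) {
        for (String name : names) {
            String value = metadata.get(name);
            if (value != null && !value.isEmpty()) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for what the report describes for RTF and ODT files:
        // the keywords are stored under "Keyword" only.
        Map<String, String> rtfLike = new LinkedHashMap<>();
        rtfLike.put("Keyword", "keyword1, keyword2");

        // A lookup by the single name "meta:keyword" finds nothing.
        System.out.println(Optional.ofNullable(rtfLike.get("meta:keyword")));

        // A lookup that falls back to the other name finds the value.
        System.out.println(firstPresent(rtfLike, CANDIDATE_NAMES));
    }
}
```

With Tika itself, the comment above indicates that no hand-written fallback is needed: {{metadata.get(TikaCoreProperties.KEYWORDS)}} is the lookup that worked for the reporter on the affected version (1.14).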