[ https://issues.apache.org/jira/browse/TIKA-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462833#comment-17462833 ]
David Pilato commented on TIKA-3629:
------------------------------------

I'm not sure I got it. Is {{Office.KEYWORDS}} supposed to give access to PDF keywords? It's not the case anymore.

> Keywords are not extracted anymore from PDF documents
> -----------------------------------------------------
>
>                 Key: TIKA-3629
>                 URL: https://issues.apache.org/jira/browse/TIKA-3629
>             Project: Tika
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.2.0
>            Reporter: David Pilato
>            Priority: Major
>
> Hey
>
> I'm seeing some changes (regressions?) in [Tika 2.2.0 (from
> 2.1.0)|https://github.com/dadoonet/fscrawler/pull/1330].
> When extracting content from Office files (docs, doc, rtf), {{cp:subject}} is
> not generated anymore. I'm not using this value anyway, so that may not be
> an issue at all but a feature ;)
>
> But for PDF documents, I can no longer get the keywords for the
> document.
> I was reading the keywords with {{Office.KEYWORDS}}, but it's now null and I
> don't see this change documented in the wiki.
>
> Is that expected or a bug?
>

--
This message was sent by Atlassian Jira
(v8.20.1#820001)