Stephen H created TIKA-4461:
-------------------------------

             Summary: In RFC822Parser support Content-Id for parts
                 Key: TIKA-4461
                 URL: https://issues.apache.org/jira/browse/TIKA-4461
             Project: Tika
          Issue Type: Improvement
          Components: parser
    Affects Versions: 3.2.1
            Reporter: Stephen H
         Attachments: mail-parser-patch.txt

Currently RFC822Parser does not store a message part's Content-Id field in that part's metadata. As a result, it is not possible to relate a cid: URL in the message HTML to the part that holds the referenced content.

The attached patch adds a new MESSAGE_CONTENT_ID property to the Message properties. MailContentHandler then sets it from the part's Content-Id field when that field is present, which is normally only the case for inline parts. I modified an existing test to check for this, which is hopefully okay.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
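
To illustrate the use case described in the issue, here is a minimal sketch of how a consumer might map cid: URLs back to parts once each part's Content-Id is available as a string. It deliberately does not call any Tika API: the class `CidResolver`, its methods, and the part names are hypothetical, and it assumes the Content-Id value arrives as the raw header text (usually wrapped in angle brackets), which may differ from what the patch actually stores.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Tika: relates cid: URLs to message parts.
public class CidResolver {

    // Normalized Content-Id (no angle brackets) -> part name.
    private final Map<String, String> partsByContentId = new LinkedHashMap<>();

    /** Trims whitespace and removes the angle brackets around a Content-Id header value. */
    public static String normalizeContentId(String headerValue) {
        String id = headerValue.trim();
        if (id.length() >= 2 && id.startsWith("<") && id.endsWith(">")) {
            id = id.substring(1, id.length() - 1);
        }
        return id;
    }

    /** Records a part under its Content-Id, as it might be read from the part's metadata. */
    public void register(String contentIdHeader, String partName) {
        partsByContentId.put(normalizeContentId(contentIdHeader), partName);
    }

    /** Returns the part referenced by a cid: URL, or empty if the URL is not a known cid: URL. */
    public Optional<String> resolve(String url) {
        // Scheme comparison is case-insensitive.
        if (!url.regionMatches(true, 0, "cid:", 0, 4)) {
            return Optional.empty();
        }
        // The cid: URL carries the Content-Id percent-encoded and without brackets.
        // URLDecoder treats '+' as a space, so protect literal '+' before decoding.
        String encoded = url.substring(4).replace("+", "%2B");
        String id = URLDecoder.decode(encoded, StandardCharsets.UTF_8);
        return Optional.ofNullable(partsByContentId.get(id));
    }

    public static void main(String[] args) {
        CidResolver resolver = new CidResolver();
        // Assumed value of an inline part's Content-Id header.
        resolver.register("<logo@example.com>", "logo.png");

        // An <img src="cid:logo@example.com"> in the HTML body resolves to the inline part.
        System.out.println(resolver.resolve("cid:logo@example.com").orElse("(no matching part)"));
        System.out.println(resolver.resolve("cid:missing@example.com").orElse("(no matching part)"));
    }
}
```

Whether the stored value keeps the angle brackets is a detail of the patch; normalizing on both sides, as above, makes the lookup work either way.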