Tim Allison created TIKA-4646:
---------------------------------
Summary: Extract links from instrText in ooxml
Key: TIKA-4646
URL: https://issues.apache.org/jira/browse/TIKA-4646
Project: Tika
Issue Type: Task
Reporter: Tim Allison
In phishing OOXML documents, the link text a user sees can be entirely
different from the actual underlying hyperlink target. One way this can happen
is that the target is hidden in the instrText element of a field.
We should extract these links.
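As a rough illustration of what pulling out these links could look like, here is a minimal sketch. It is not Tika's implementation. It assumes the link sits in a complex field whose instruction has the form HYPERLINK "target", possibly split across several instrText runs. The class name InstrTextLinks and the method extract are hypothetical, and the sketch only handles quoted targets and ignores nested fields.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, for illustration only.
final class InstrTextLinks {

    // Assumption: the target is the first quoted argument after HYPERLINK.
    private static final Pattern HYPERLINK =
            Pattern.compile("(?i)\\bHYPERLINK\\s+\"([^\"]+)\"");

    // Returns the HYPERLINK targets found in field instructions of the given XML.
    static List<String> extract(String documentXml) {
        List<String> links = new ArrayList<>();
        StringBuilder instruction = new StringBuilder();
        boolean inInstrText = false;
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Untrusted input: turn off DTDs and external entities.
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
            XMLStreamReader reader =
                    factory.createXMLStreamReader(new StringReader(documentXml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    // Match on local names only, so the namespace URI is not checked.
                    String name = reader.getLocalName();
                    if ("instrText".equals(name)) {
                        inInstrText = true;
                    } else if ("fldChar".equals(name)) {
                        // Any field boundary ends the instruction collected so far.
                        collect(instruction, links);
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if ("instrText".equals(reader.getLocalName())) {
                        inInstrText = false;
                    }
                } else if (inInstrText && (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA)) {
                    // Instructions may be split across runs, so concatenate them.
                    instruction.append(reader.getText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse document XML", e);
        }
        collect(instruction, links);
        return links;
    }

    // Adds every HYPERLINK target in the buffered instruction, then clears the buffer.
    private static void collect(StringBuilder instruction, List<String> links) {
        Matcher m = HYPERLINK.matcher(instruction);
        while (m.find()) {
            links.add(m.group(1));
        }
        instruction.setLength(0);
    }
}
```

Calling InstrTextLinks.extract on the XML of a document part would return the field targets, which could then be reported alongside the visible link text so that a mismatch between the two is detectable.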
Thanks to Mike Flester on the user list for sharing a triggering file and for
raising this issue.
https://lists.apache.org/thread/r85h0wrqknszxhklmy9q5o4wzb8gwgcg
--
This message was sent by Atlassian Jira
(v8.20.10#820010)