Tim Allison created TIKA-4646:
---------------------------------

             Summary: Extract links from instrText in ooxml
                 Key: TIKA-4646
                 URL: https://issues.apache.org/jira/browse/TIKA-4646
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison


In phishing OOXML documents, the link text a user sees can differ entirely 
from the actual underlying hyperlink target. One way this can happen is that 
the target is hidden in a HYPERLINK field code inside an instrText element.

We should pull out these links.
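
To illustrate the kind of extraction meant here, the following is a minimal 
standalone sketch in Python. It is not Tika's implementation. The function 
name extract_instr_text_links, the sample XML, and the example hosts are 
hypothetical. It assumes the target is a quoted URL directly after the 
HYPERLINK keyword, and that fields are not nested; unquoted targets and 
w:fldSimple fields are not handled.

```python
import re
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by .docx document.xml parts.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
INSTR_TEXT = f"{{{W_NS}}}instrText"
FLD_CHAR = f"{{{W_NS}}}fldChar"

# Captures the quoted target that directly follows the HYPERLINK keyword.
HYPERLINK_RE = re.compile(r'HYPERLINK\s+"([^"]+)"', re.IGNORECASE)


def extract_instr_text_links(document_xml: str) -> list:
    """Return HYPERLINK targets found in w:instrText field instructions."""
    links = []
    buffer = []

    def flush():
        # Match against the joined instruction, then reset for the next field.
        instr = "".join(buffer)
        buffer.clear()
        links.extend(HYPERLINK_RE.findall(instr))

    for elem in ET.fromstring(document_xml).iter():
        if elem.tag == INSTR_TEXT:
            # One instruction may be split across several runs, so collect pieces.
            buffer.append(elem.text or "")
        elif elem.tag == FLD_CHAR:
            # Any field boundary (begin, separate, end) closes the current instruction.
            flush()
    flush()
    return links


# Hypothetical sample: the visible text shows one URL, the field code another.
SAMPLE = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body><w:p>'
    '<w:r><w:fldChar w:fldCharType="begin"/></w:r>'
    '<w:r><w:instrText xml:space="preserve"> HYPERLINK "http://phish.</w:instrText></w:r>'
    '<w:r><w:instrText xml:space="preserve">example/login" </w:instrText></w:r>'
    '<w:r><w:fldChar w:fldCharType="separate"/></w:r>'
    '<w:r><w:t>https://bank.example/login</w:t></w:r>'
    '<w:r><w:fldChar w:fldCharType="end"/></w:r>'
    '</w:p></w:body></w:document>'
)

if __name__ == "__main__":
    print(extract_instr_text_links(SAMPLE))  # ['http://phish.example/login']
```

Joining the instrText pieces before matching matters because Word may split 
a single field instruction across several runs, so matching each piece on its 
own could miss the URL.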

Thanks to Mike Flester on the user list for sharing a triggering file and for 
raising this issue.

https://lists.apache.org/thread/r85h0wrqknszxhklmy9q5o4wzb8gwgcg



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
