[ https://issues.apache.org/jira/browse/TIKA-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison resolved TIKA-2563.
-------------------------------
       Resolution: Fixed
         Assignee: Tim Allison
    Fix Version/s: 2.0.0
                   1.18

> Extract embedded objects in HTML and javascript
> -----------------------------------------------
>
>                 Key: TIKA-2563
>                 URL: https://issues.apache.org/jira/browse/TIKA-2563
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Trivial
>             Fix For: 1.18, 2.0.0
>
>         Attachments: consumentenbond.html, testHTML_embedded_img.html
>
>
> Files (esp images) and other objects can be embedded in html/css/javascript
> with the [data: uri scheme|https://en.wikipedia.org/wiki/Data_URI_scheme].
> We should extract those like any other embedded file.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
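
For readers unfamiliar with the data: URI scheme mentioned in the description, here is a minimal sketch of what "extracting" such objects could look like. It is not Tika's implementation: it uses only the Python standard library, the names decode_data_uri, DataUriCollector, and extract_embedded are hypothetical, and it only inspects HTML attribute values (not CSS or javascript bodies).

```python
import base64
import re
from html.parser import HTMLParser
from urllib.parse import unquote_to_bytes

# General shape of a data: URI (RFC 2397): data:[<mediatype>][;base64],<data>
DATA_URI = re.compile(r"data:(?P<meta>[^,]*),(?P<data>.*)", re.DOTALL | re.IGNORECASE)


def decode_data_uri(uri):
    """Return (media_type, payload_bytes), or None if uri is not a data: URI.

    Hypothetical helper for illustration only.
    """
    match = DATA_URI.fullmatch(uri.strip())
    if match is None:
        return None
    parts = match.group("meta").split(";")
    # An omitted media type defaults to text/plain.
    media_type = parts[0] or "text/plain"
    is_base64 = any(p.strip().lower() == "base64" for p in parts[1:])
    # The data portion may be percent-encoded in either form.
    raw = unquote_to_bytes(match.group("data"))
    if is_base64:
        raw = base64.b64decode(raw)
    return media_type, raw


class DataUriCollector(HTMLParser):
    """Collects decoded data: URIs found in any attribute of any start tag."""

    def __init__(self):
        super().__init__()
        self.embedded = []

    def handle_starttag(self, tag, attrs):
        for _name, value in attrs:
            if value is None:  # attribute without a value, e.g. <input disabled>
                continue
            decoded = decode_data_uri(value)
            if decoded is not None:
                self.embedded.append(decoded)


def extract_embedded(html):
    """Return a list of (media_type, payload_bytes) for data: URIs in html."""
    collector = DataUriCollector()
    collector.feed(html)
    collector.close()
    return collector.embedded
```

Calling extract_embedded on a page containing an img tag whose src is a base64 data: URI would yield one (media type, bytes) pair per embedded object, which a caller could then hand to whatever handles other embedded files.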