[ https://issues.apache.org/jira/browse/TIKA-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17898240#comment-17898240 ]
ASF GitHub Bot commented on TIKA-4350:
--------------------------------------

sebastian-nagel opened a new pull request, #2045:
URL: https://github.com/apache/tika/pull/2045

   Trivial solution adding `<iframe>` as a `root-XML` hint, analogous to `<frameset>`.

> HTML snippet containing <iframe> as root element erroneously recognized as application/xml
> ------------------------------------------------------------------------------------------
>
>                 Key: TIKA-4350
>                 URL: https://issues.apache.org/jira/browse/TIKA-4350
>             Project: Tika
>          Issue Type: Bug
>          Components: detector, mime
>    Affects Versions: 3.0.0
>            Reporter: Sebastian Nagel
>            Priority: Major
>
> A HTML snippet containing an <iframe> element as document root is erroneously
> recognized as {{application/xml}}.
> This issue was reported on the Nutch user mailing list for Nutch 1.19 using
> Tika 2.3.0:
> [https://lists.apache.org/thread/fhhp1p6y4ttxmplvz1ohk3wwjz25ozbc]
> The problem is reproducible with Tika 3.0.0
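
For illustration only, the kind of change the pull request describes might look roughly like the fragment below in Tika's MIME type definitions (`tika-mimetypes.xml`). This is a sketch, not the actual diff from the pull request: the list of existing entries is abridged, and the exact placement of the new line is an assumption based on the description "analogous to `<frameset>`".

```xml
<!-- Sketch only; not the actual diff from the pull request. -->
<mime-type type="text/html">
  <!-- existing root-element hints (abridged) -->
  <root-XML localName="html"/>
  <root-XML localName="frameset"/>
  <!-- assumed addition: a root <iframe> should be detected as HTML
       instead of falling back to generic application/xml -->
  <root-XML localName="iframe"/>
</mime-type>
```

The idea is that a `root-XML` entry lets the detector map the root element name of an XML-like document to a more specific type, so listing `iframe` under `text/html` would keep such snippets from being reported as plain XML.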