[ https://issues.apache.org/jira/browse/TIKA-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17922306#comment-17922306 ]

Tilman Hausherr commented on TIKA-4373:
---------------------------------------

I found some huge differences with some HTML files, but these are because of incorrect HTML, e.g. 032223.html and 876421.html.

> Regression tests for 3.1.0 release
> ----------------------------------
>
>                 Key: TIKA-4373
>                 URL: https://issues.apache.org/jira/browse/TIKA-4373
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Major
>             Fix For: 3.1.0
>
>         Attachments: S53SZFZ2FBOZIVTX3HVP4D4XKHKPEMQQ.csv,
> filter_md5_suc_url.json, reports-tika-3.0.0-v-3.1.0-rc1.tgz,
> reports_tika-3.0-vs-3.1.tgz
>
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)