[ 
https://issues.apache.org/jira/browse/TIKA-4411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17950951#comment-17950951
 ] 

Tim Allison edited comment on TIKA-4411 at 5/12/25 12:52 PM:
-------------------------------------------------------------

OK. The XHTML issue appears to come from a difference in how jsoup 1.18.3 and 
jsoup 1.19.1 handle broken XHTML.

Note: the behavior changed between jsoup 1.18.3 and 1.19.1. We're now using 
the latest version of jsoup, 1.20.1, which still has the 1.19.1 behavior.

The publicly available example file is here: 
https://bug1554250.bmoattachments.org/attachment.cgi?id=9068831

If anyone wants to dig into this and open an issue on jsoup (if there's a 
problem?!), please go for it.

I don't think this is a significant enough difference to warrant downgrading 
jsoup to 1.18.3.

I'll start the 3.2.0 release process shortly. I'm happy to respin if anyone 
disagrees or would prefer a different solution. Or, of course, if you notice 
any other problems!

Onwards!


> Run the 3.2.0 release process
> -----------------------------
>
>                 Key: TIKA-4411
>                 URL: https://issues.apache.org/jira/browse/TIKA-4411
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>             Fix For: 3.2.0
>
>         Attachments: reports-3.2.0-pre-rc1.tgz, reports-3.2.0.tgz
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
