[ 
https://issues.apache.org/jira/browse/TIKA-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Mellor updated TIKA-2581:
------------------------------
    Description: 
TesseractOCRParserTest.testOCROutputsHOCR fails with Tesseract 4.0.

With 3.x, the output is <span>Happy</span>, but with 4.0 the output is
<span><strong>Happy</strong></span>.

 

  was:
TesseractOCRParserTest.testOCROutputsHOCR fails with Tesseract 4.0.

With 3.x, the output is `<span>Happy</span>` but with 4.0 the output is 
`<span><strong>Happy</strong></span>`.

 


> testOCROutputsHOCR fails with Tesseract 4.0
> -------------------------------------------
>
>                 Key: TIKA-2581
>                 URL: https://issues.apache.org/jira/browse/TIKA-2581
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.17
>            Reporter: Ewan Mellor
>            Priority: Minor
>
> TesseractOCRParserTest.testOCROutputsHOCR fails with Tesseract 4.0.
> With 3.x, the output is <span>Happy</span> but with 4.0 the output is 
> <span><strong>Happy</strong></span>.
>  
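
One way a test could tolerate both output forms is sketched below. This is only an illustration, not the fix applied in Tika: the class HocrWordCheck and the method spanContainsWord are hypothetical names, and the sketch assumes the only difference between versions is the optional <strong> wrapper described above.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika or its test suite.
class HocrWordCheck {

    // Returns true if `word` is the content of a <span>, either directly
    // (the Tesseract 3.x form in the report) or wrapped in <strong>
    // (the Tesseract 4.0 form in the report).
    static boolean spanContainsWord(String hocr, String word) {
        String w = Pattern.quote(word); // treat the word as a literal
        Pattern p = Pattern.compile(
            "<span[^>]*>\\s*(?:<strong>\\s*" + w + "\\s*</strong>|" + w + ")\\s*</span>");
        return p.matcher(hocr).find();
    }

    public static void main(String[] args) {
        // Both forms quoted in the issue description.
        System.out.println(spanContainsWord("<span>Happy</span>", "Happy"));
        System.out.println(spanContainsWord("<span><strong>Happy</strong></span>", "Happy"));
    }
}
```

An assertion built on such a check would pass for either Tesseract version, at the cost of no longer pinning the exact markup.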



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
