[ https://issues.apache.org/jira/browse/TIKA-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17731671#comment-17731671 ]
Tim Allison commented on TIKA-4043:
-----------------------------------
[~grossws], whenever you have a chance, would you be able to confirm that these
changes fixed your build? Please reopen if more work is required. Thank you!
> Fix build for variations in tesseract and timezone info in RTFs
> ---------------------------------------------------------------
>
> Key: TIKA-4043
> URL: https://issues.apache.org/jira/browse/TIKA-4043
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Major
>
> From [~grossws]:
> > * OCR (tesseract) multipage test still behaves the same: it extracts
> > "Page?2" instead of "Page 2" on my laptop;
> > * RTFParserTest testMetaDataCounts fails because of a different time zone:
> > the RTF format itself stores only local date/time in its metadata, and I
> > fall on a different side of midnight with my local time (known issue,
> > requires some changes in metadata to handle correctly). Building with
> > TZ=UTC works fine.
> We should fix these.
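
The "different side of midnight" effect described above can be reproduced outside of Tika. The following is only a minimal sketch using java.time, not the RTF parser's actual code: the class name, the helper utcDateOf, the sample timestamp, and the +03:00 offset are all hypothetical and chosen purely for illustration. It shows how one zone-less local timestamp maps to two different UTC calendar dates depending on the zone it is interpreted in.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class LocalMidnightDemo {

    // Interpret a zone-less local timestamp in the given zone,
    // then return the calendar date of that same instant in UTC.
    static LocalDate utcDateOf(LocalDateTime local, ZoneId zone) {
        return local.atZone(zone)
                .withZoneSameInstant(ZoneOffset.UTC)
                .toLocalDate();
    }

    public static void main(String[] args) {
        // Hypothetical timestamp shortly after local midnight,
        // standing in for a date/time stored without zone information.
        LocalDateTime local = LocalDateTime.of(2023, 6, 12, 1, 30);

        // Build machine running in UTC: the date is unchanged.
        System.out.println(utcDateOf(local, ZoneOffset.UTC));

        // Build machine running at UTC+3 (assumed offset): the same
        // local value is 22:30 on the previous day in UTC.
        System.out.println(utcDateOf(local, ZoneOffset.ofHours(3)));
    }
}
```

A test that compares the resulting date against a fixed expected value therefore passes or fails depending on the build machine's default zone, which matches the report that building with TZ=UTC works.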
--
This message was sent by Atlassian Jira
(v8.20.10#820010)