[ 
https://issues.apache.org/jira/browse/TIKA-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17385445#comment-17385445
 ] 

David Pilato edited comment on TIKA-3493 at 7/22/21, 11:59 AM:
---------------------------------------------------------------

I attached a patch that adds a unit test.

It fails with:
{code:java}
org.junit.ComparisonFailure:
 Expected :2006-05-18T07:19:00Z
 Actual :2006-05-18T10:19:00Z{code}


> dcterms:created date depends on the current TimeZone in RTF documents
> ---------------------------------------------------------------------
>
>                 Key: TIKA-3493
>                 URL: https://issues.apache.org/jira/browse/TIKA-3493
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 2.0.0
>            Reporter: David Pilato
>            Priority: Minor
>         Attachments: Test_case_to_demo_the_change_with_Tika_1_x1.patch
>
>
> I'm migrating an existing project to Tika 2.0.0 and I'm seeing a strange behavior.
> TL;DR: the created date of the document changes depending on the time zone.
> Long story:
> I have a unit test which extracts content and metadata from an [RTF document|https://github.com/dadoonet/fscrawler/raw/master/test-documents/src/main/resources/documents/test.rtf].
> When using Tika 1.27, whatever the time zone defined for my JVM, I always 
> get the same value for "dcterms:created": "2016-07-07T13:38:00Z".
> When running the same test with Tika 2.0.0, the date changes depending on the 
> time zone.
> For example:
>  * Asia/Sakhalin gives dcterms:created=2016-07-06T23:38:00Z
>  * Asia/Colombo gives dcterms:created=2016-07-07T05:08:00Z
>  * Europe/Stockholm gives dcterms:created=2016-07-07T08:38:00Z
>  
> I don't know whether this is a bug or expected. Maybe the RTF format 
> does not specify the time zone.
> I'm actually surprised that I don't see the same behavior for Office documents.
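
The three values above are what you would get if a date-time without any zone information were interpreted in the JVM's time zone and then converted to UTC. Here is a minimal sketch of that effect using only java.time. It is an illustration, not Tika's RTF parser code. The class name ZoneDependentDate, the helper toUtc, and the local value 2016-07-07T10:38 are assumptions: the value was picked only because it reproduces the three reported results, and it was not read from test.rtf.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

// Illustration only: this is not Tika's RTF parser code.
class ZoneDependentDate {

    // Interprets a zone-less local date-time in the given zone
    // and renders the corresponding UTC instant as ISO-8601.
    static String toUtc(LocalDateTime local, String zoneId) {
        return local.atZone(ZoneId.of(zoneId)).toInstant().toString();
    }

    public static void main(String[] args) {
        // Hypothetical zone-less value (an assumption, see the note above).
        LocalDateTime local = LocalDateTime.of(2016, 7, 7, 10, 38);
        String[] zones = {"Asia/Sakhalin", "Asia/Colombo", "Europe/Stockholm"};
        for (String zone : zones) {
            System.out.println(zone + " gives " + toUtc(local, zone));
        }
    }
}
```

If the date is handled this way somewhere in the 2.0.0 code path, interpreting the value in one fixed zone (for example UTC) regardless of the JVM default would make the result stable again.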



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
