[ https://issues.apache.org/jira/browse/TIKA-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Pilato updated TIKA-3493:
-------------------------------
    Attachment:     (was: Test_case_to_demo_the_change_with_Tika_1_x.patch)

> dcterms:created date depends on the current TimeZone in RTF documents
> ---------------------------------------------------------------------
>
>                 Key: TIKA-3493
>                 URL: https://issues.apache.org/jira/browse/TIKA-3493
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 2.0.0
>            Reporter: David Pilato
>            Priority: Minor
>         Attachments: Test_case_to_demo_the_change_with_Tika_1_x1.patch
>
>
> I'm migrating an existing project to Tika 2.0.0 and I'm seeing a strange
> behavior.
>
> TL;DR: the created date of the document changes depending on the JVM's
> time zone.
>
> Long story:
>
> I have a unit test which extracts content and metadata from an RTF document
> (https://github.com/dadoonet/fscrawler/raw/master/test-documents/src/main/resources/documents/test.rtf).
>
> With Tika 1.27, whatever time zone is defined for my JVM, I always get the
> same value for "dcterms:created": "2016-07-07T13:38:00Z".
>
> When running the same test with Tika 2.0.0, the date changes depending on
> the time zone. For example:
>
> * Asia/Sakhalin gives dcterms:created=2016-07-06T23:38:00Z
> * Asia/Colombo gives dcterms:created=2016-07-07T05:08:00Z
> * Europe/Stockholm gives dcterms:created=2016-07-07T08:38:00Z
>
> I don't know whether this is a bug or expected. Maybe the RTF format does
> not specify the time zone.
>
> I'm surprised that I don't see the same behavior for Office documents.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
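
The three Tika 2.0.0 values reported in the issue are all consistent with one wall-clock time, 2016-07-07 10:38, being interpreted in the JVM's default time zone and then converted to UTC. The following is a minimal sketch of that effect using only `java.time`. It does not call Tika and makes no claim about how Tika's RTF parser is implemented. The wall-clock value is an assumption inferred from the reported offsets, not read from the RTF file, and the class `RtfDateDemo` and its helper `toUtc` are hypothetical names used only for illustration.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

// Hypothetical demo class, not part of Tika.
final class RtfDateDemo {

    // Interprets a zone-less wall-clock time in the given zone
    // and returns the corresponding UTC instant.
    static Instant toUtc(LocalDateTime wallClock, String zoneId) {
        return wallClock.atZone(ZoneId.of(zoneId)).toInstant();
    }

    public static void main(String[] args) {
        // Assumption: creation time inferred from the values reported in the issue.
        LocalDateTime created = LocalDateTime.of(2016, 7, 7, 10, 38);
        String[] zones = {"Asia/Sakhalin", "Asia/Colombo", "Europe/Stockholm", "UTC"};
        for (String zone : zones) {
            System.out.println(zone + " gives " + toUtc(created, zone));
        }
    }
}
```

If a date without zone information is always interpreted in one fixed zone (UTC, for example), the resulting instant no longer depends on the JVM default, which is the behavior the reporter describes for Tika 1.27.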