Thank you, that makes sense. Whilst one definitely could argue that
round-tripping dates ~6 million years in the future, when Date32 would
overflow, is perhaps a little esoteric, I can see how it would be
different from a timestamp. The use of milliseconds does seem
unnecessarily confusing, but I suppose the opportunity for changing that
has long since passed.
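
For reference, the ~6 million year figure can be checked with a little arithmetic. This is a minimal sketch, assuming Date32 is a signed 32-bit count of days since the UNIX epoch, Date64 is a signed 64-bit count of milliseconds, and a mean Gregorian year of 365.2425 days. The constant and function names are illustrative only, not Arrow APIs.

```python
# Rough range estimate for Date32 and Date64, in years after 1970-01-01.
# Assumptions: signed integer storage and a mean Gregorian year length.

DAYS_PER_YEAR = 365.2425      # mean Gregorian year (assumption for the estimate)
MS_PER_DAY = 86_400_000       # 24 * 60 * 60 * 1000

MAX_INT32 = 2**31 - 1
MAX_INT64 = 2**63 - 1


def date32_range_years() -> float:
    """Years representable after the epoch with int32 days."""
    return MAX_INT32 / DAYS_PER_YEAR


def date64_range_years() -> float:
    """Years representable after the epoch with int64 milliseconds."""
    return MAX_INT64 / MS_PER_DAY / DAYS_PER_YEAR


if __name__ == "__main__":
    print(f"Date32: ~{date32_range_years():,.0f} years")
    print(f"Date64: ~{date64_range_years():,.0f} years")
```

Under these assumptions Date32 covers a little under 6 million years after the epoch, and Date64 covers a few hundred million.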
On 22/02/2023 15:13, David Li wrote:
I believe it was meant to round-trip with Java and/or other languages
(JavaScript, others?) that basically just stored dates as timestamps. (Java
gained a proper date type eventually, at least.)
Semantically it is a date, not a timestamp. Ignore the representation. Hence,
it represents the latter: the date, not the timestamp. And so if the physical
value is not on a day boundary, it is invalid. Arrow C++ validates this [1].
[1]: https://github.com/apache/arrow/blob/bda727f9fe56e0abd4fa2770d7175c9074306573/cpp/src/arrow/array/validate.cc#L172-L190
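
The day-boundary rule described above can be sketched in a few lines. This is not the Arrow C++ validation code, only an illustration of the same idea using the Python standard library. The helper names `is_valid_date64` and `date64_to_date` are hypothetical.

```python
from datetime import date, timedelta

MS_PER_DAY = 86_400_000  # milliseconds in one day
EPOCH = date(1970, 1, 1)


def is_valid_date64(ms: int) -> bool:
    """A Date64 value is treated as valid only if it falls on a day boundary."""
    return ms % MS_PER_DAY == 0


def date64_to_date(ms: int) -> date:
    """Interpret a Date64 value as a calendar date, rejecting partial days."""
    if not is_valid_date64(ms):
        raise ValueError(f"{ms} ms is not a whole number of days")
    return EPOCH + timedelta(days=ms // MS_PER_DAY)


if __name__ == "__main__":
    # Build the Date64 value for 2020-03-19 from the day count.
    ms = (date(2020, 3, 19) - EPOCH).days * MS_PER_DAY
    print(date64_to_date(ms))        # a date with no time-of-day component
    print(is_valid_date64(ms + 1))   # one millisecond past midnight
```

The result is a plain date rather than a timestamp, which matches the reading that 2020-03-19 is meant and not 2020-03-19 00:00:00.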
On Wed, Feb 22, 2023, at 09:59, Lee, David (PAG) wrote:
class Date64Type : public arrow::DateType
<https://arrow.apache.org/docs/cpp/api/datatype.html#_CPPv4N5arrow10Date64TypeE>
#include <arrow/type.h>
Concrete type class for 64-bit date data (as number of milliseconds since UNIX
epoch)
Timestamps are a different logical type with precision and timezone support.
Data Types — Apache Arrow v11.0.0 <https://arrow.apache.org/docs/cpp/api/datatype.html>
On Feb 22, 2023, at 4:27 AM, Raphael Taylor-Davies <[email protected]> wrote:
Hi,
The Date64 type is a source of common confusion for myself and the
community and I wonder if someone might be able to shed some light on
its purpose.
In particular:
- It cannot be round-tripped through parquet
- It is unclear how it is different from Timestamp(TimeUnit::Millisecond)
- Does it represent the quantity 2020-03-19 00:00:00 or 2020-03-19, i.e.
without the time
- What should be done if the value is not divisible by the number of
milliseconds in a day
Any clarifications would be most appreciated.
Kind Regards,
Raphael