Thank you, that makes sense. Whilst one could certainly argue that round-tripping dates ~6 million years in the future, where Date32 would overflow, is a little esoteric, I can see how Date64 differs from a timestamp. The use of milliseconds does seem unnecessarily confusing, but I suppose the opportunity to change that has long since passed.
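
As a rough check on the "~6 million years" figure, here is a minimal sketch of the arithmetic. It assumes Date32 is a signed 32-bit count of days since the UNIX epoch and Date64 a signed 64-bit count of milliseconds; the constant and variable names are illustrative only and are not part of any Arrow API.

```python
# Back-of-the-envelope range estimate (illustrative names, not Arrow API).
DAYS_PER_YEAR = 365.2425           # mean Gregorian year length, in days
MS_PER_DAY = 24 * 60 * 60 * 1000   # 86,400,000 milliseconds

date32_max_days = 2**31 - 1        # assumed: signed 32-bit day count
date64_max_ms = 2**63 - 1          # assumed: signed 64-bit millisecond count

# Convert each maximum into years past the 1970 epoch.
date32_years = date32_max_days / DAYS_PER_YEAR
date64_years = date64_max_ms / MS_PER_DAY / DAYS_PER_YEAR

print(f"Date32 reaches roughly {date32_years / 1e6:.1f} million years past 1970")
print(f"Date64 reaches roughly {date64_years / 1e6:.0f} million years past 1970")
```

Under these assumptions the extra range of Date64 only matters for dates millions of years away, which is why the round-trip argument is a niche one.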

On 22/02/2023 15:13, David Li wrote:
I believe it was meant to round-trip with Java and/or other languages 
(JavaScript, others?) that basically just stored dates as timestamps. (Java 
gained a proper date type eventually, at least.)

Semantically it is a date, not a timestamp. Ignore the representation. Hence, 
it represents the latter: the date, not the timestamp. And so if the physical 
value is not on a day boundary, it is invalid. Arrow C++ validates this [1].

[1]: https://github.com/apache/arrow/blob/bda727f9fe56e0abd4fa2770d7175c9074306573/cpp/src/arrow/array/validate.cc#L172-L190
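
The day-boundary rule described above can be sketched in a few lines. This is not the Arrow C++ validation code linked in [1], only an illustration of the same idea using the Python standard library; `is_valid_date64` and `date_to_date64` are hypothetical helper names.

```python
from datetime import date

MS_PER_DAY = 86_400_000  # 24 * 60 * 60 * 1000
EPOCH = date(1970, 1, 1)

def date_to_date64(d: date) -> int:
    """Hypothetical helper: encode a calendar date as milliseconds since the epoch."""
    return (d - EPOCH).days * MS_PER_DAY

def is_valid_date64(ms_since_epoch: int) -> bool:
    """Hypothetical helper: a Date64 value is valid only on an exact day boundary."""
    return ms_since_epoch % MS_PER_DAY == 0

midnight = date_to_date64(date(2020, 3, 19))
print(is_valid_date64(midnight))      # value lies on a day boundary
print(is_valid_date64(midnight + 1))  # one millisecond past midnight
```

A value one millisecond past midnight carries a time-of-day component, so under this reading it is a timestamp rather than a date and would be rejected.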

On Wed, Feb 22, 2023, at 09:59, Lee, David (PAG) wrote:



class Date64Type : public arrow::DateType
#include <arrow/type.h>
Concrete type class for 64-bit date data (as number of milliseconds since UNIX 
epoch)


Timestamps are a different logical type with precision and timezone support.

Data Types — Apache Arrow 
v11.0.0 <https://arrow.apache.org/docs/cpp/api/datatype.html>
On Feb 22, 2023, at 4:27 AM, Raphael Taylor-Davies <[email protected]> wrote:


Hi,

The Date64 type is a common source of confusion for me and the
community and I wonder if someone might be able to shed some light on
its purpose.

In particular:

- It cannot be round-tripped through Parquet
- It is unclear how it differs from Timestamp(TimeUnit::Millisecond)
- Does it represent the quantity 2020-03-19 00:00:00 or 2020-03-19, i.e.
without the time?
- What should be done if the value is not divisible by the number of
milliseconds in a day?

Any clarifications would be most appreciated.

Kind Regards,

Raphael

