On Mon, Aug 10, 2020 at 6:19 PM Eric Erhardt
<eric.erha...@microsoft.com.invalid> wrote:
>
> I don't understand what the value of the Date64 type is over using Date32:
>
> From https://github.com/apache/arrow/blob/master/format/Schema.fbs#L193-L206
>
> enum DateUnit: short {
>   DAY,
>   MILLISECOND
> }
>
> /// Date is either a 32-bit or 64-bit type representing elapsed time since UNIX
> /// epoch (1970-01-01), stored in either of two units:
> ///
> /// * Milliseconds (64 bits) indicating UNIX time elapsed since the epoch (no
> ///   leap seconds), where the values are evenly divisible by 86400000
> /// * Days (32 bits) since the UNIX epoch
> table Date {
>   unit: DateUnit = MILLISECOND;
> }
>
> If the spec specifies that Date64 must be evenly divisible by 86400000, I 
> don't see the point in using millisecond units. I can't represent any 
> different information in my data. So why would I take up double the space to 
> represent the same information?
>
> Can someone explain when Date64 is useful?

As I recall, the motivation for the date64 type is to allow
zero-copy of dates-as-milliseconds, which are used in some other
libraries / platforms. For example, Joda uses a millisecond-based
"instant". I'm not sure offhand which others do.

That said, it would be perfectly reasonable for a data processing
system to use date32 throughout and convert any date64 data to date32
if desired.
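
As a rough illustration of that conversion, here is a minimal sketch in
plain Python. It does not use Arrow's API; the function names and the
MS_PER_DAY constant are made up for this example, and the only rule it
relies on is the one in the spec comment quoted above (date64 values
are evenly divisible by 86400000).

```python
from datetime import date, timedelta

# Milliseconds in one day, as in the Schema.fbs comment.
MS_PER_DAY = 86_400_000
UNIX_EPOCH = date(1970, 1, 1)


def date64_to_date32(ms: int) -> int:
    """Convert milliseconds since the UNIX epoch to days since the epoch.

    Hypothetical helper: rejects values that are not whole days, since
    the spec says date64 values must be evenly divisible by 86400000.
    """
    if ms % MS_PER_DAY != 0:
        raise ValueError(f"{ms} is not a whole number of days")
    return ms // MS_PER_DAY


def date32_to_date64(days: int) -> int:
    """Convert days since the UNIX epoch to milliseconds since the epoch."""
    return days * MS_PER_DAY


def date32_to_date(days: int) -> date:
    """Interpret a date32 value as a calendar date."""
    return UNIX_EPOCH + timedelta(days=days)


if __name__ == "__main__":
    ms = date32_to_date64(18484)
    print(ms, date64_to_date32(ms), date32_to_date(18484))
```

The round trip loses nothing, which is the point of the original
question: for valid data both types carry the same information, and
date64 only saves the conversion step when the source already stores
milliseconds.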

> Eric
