I think a stronger case needs to be made for adding a new builtin type to support this. Can you provide concrete use cases? Why can't dates outside the range representable by int64 be truncated (even at nanosecond precision, the 64-bit maximum is over 200 years in the future)? In most cases, nanosecond-level values that fall outside the 64-bit range seem to be sentinel values.
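To put numbers on the int64 range, here is a minimal Python sketch (not Arrow code; the helper names are made up for illustration). It computes the bounds of int64 nanoseconds since the Unix epoch and shows one possible reading of "truncated", namely clamping out-of-range values to those bounds.

```python
from datetime import datetime, timedelta, timezone

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def int64_nanos_bounds():
    """Earliest and latest instants representable as int64 nanoseconds since the epoch.

    Python's datetime only resolves microseconds, so the sub-microsecond
    part is dropped (floor division) before converting.
    """
    earliest = EPOCH + timedelta(microseconds=INT64_MIN // 1000)
    latest = EPOCH + timedelta(microseconds=INT64_MAX // 1000)
    return earliest, latest


def clamp_to_int64(nanos):
    """Hypothetical 'truncation': saturate an arbitrary-size nanosecond count to int64."""
    return max(INT64_MIN, min(INT64_MAX, nanos))


earliest, latest = int64_nanos_bounds()
# earliest.year is 1677 and latest.year is 2262, i.e. roughly 292 years
# on either side of 1970.
print(earliest.isoformat(), latest.isoformat())
```

Under this reading, a sentinel such as 9999-12-31 would map to INT64_MAX rather than being rejected. Whether that loss is acceptable is exactly the use-case question above.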
FWIW, Parquet had an int96 type that was used for timestamps but it has been deprecated [1] in favor of int64 nanos.

-Micah

[1] https://issues.apache.org/jira/browse/PARQUET-323

On Tue, Aug 4, 2020 at 8:52 PM Fan Liya <liya.fa...@gmail.com> wrote:

> Hi Ji,
>
> This sounds like a universal requirement, as 64-bit is not sufficient to
> hold the precision for nano-second.
>
> For the extension type, we have two choices:
> 1. Extending struct(int64, int32), which represents the design of SoA
> (Struct of Arrays).
> 2. Extending fixed width binary(12), which represents the design of AoS
> (Array of Structs)
>
> Given the universal requirement, I'd prefer a new type.
>
> Best,
> Liya Fan
>
>
> On Wed, Aug 5, 2020 at 11:18 AM Ji Liu <tianc...@apache.org> wrote:
>
> > Hi all,
> >
> > Now in Arrow Timestamp type, it support different TimeUnit(seconds,
> > milliseconds, microseconds, nanoseconds) with int64 type for storage. In
> > most cases this is enough, but if the timestamp value range of external
> > system exceeds int64_t::max, then it's impossible to directly convert to
> > Arrow Timestamp, consider the following user case:
> >
> > A timestamp in other system with int64 + int32(stores milliseconds and
> > nanoseconds) can represent data from 0000-00-00 to 9999-12-31
> > 23:59:59.999999999, if we want to convert type like this, how should we do?
> > One probably create an extension type with struct(int64, int32) for
> > storage.
> >
> > Besides ExtensionType, are we considering extending our Timestamp for wider
> > range or maybe a new type for cases above?
> >
> >
> > Thanks,
> > Ji Liu
> >
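
The two storage layouts mentioned in the thread, struct(int64, int32) and fixed width binary(12), can be sketched in plain Python with the standard `struct` module. This is only an illustration, not Arrow code. It assumes the int64 holds epoch milliseconds and the int32 holds the nanoseconds within that millisecond, which the thread does not spell out. The function names are hypothetical.

```python
import struct

# One 12-byte record: little-endian int64 followed by int32, no padding.
RECORD = struct.Struct("<qi")

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def to_aos(millis, nanos):
    """Array of Structs: one contiguous buffer of 12-byte (millis, nanos) records."""
    return b"".join(RECORD.pack(m, n) for m, n in zip(millis, nanos))


def from_aos(buf):
    """Decode the 12-byte records back into (millis, nanos) tuples."""
    return list(RECORD.iter_unpack(buf))


def to_soa(millis, nanos):
    """Struct of Arrays: one int64 buffer and one int32 buffer, stored separately."""
    millis_buf = struct.pack(f"<{len(millis)}q", *millis)
    nanos_buf = struct.pack(f"<{len(nanos)}i", *nanos)
    return millis_buf, nanos_buf


def fits_int64_nanos(millis, nanos):
    """True if the value also fits a single int64 nanosecond timestamp."""
    total = millis * 1_000_000 + nanos
    return INT64_MIN <= total <= INT64_MAX


# 9999-12-31T23:59:59.999 UTC in epoch milliseconds, plus 999999 ns.
far_future = (253402300799999, 999999)
print(fits_int64_nanos(*far_future))  # False: needs the wider representation
print(fits_int64_nanos(0, 0))         # True: the epoch itself
```

Both layouts carry the same 12 bytes per value. The difference is whether the two fields sit in one interleaved buffer or in two separate ones.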