Ah, that makes sense to wait then.

On Thu, May 6, 2021 at 10:55 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> I'll address the feedback.  I think in the past we've waited for 
> implementations in java and c++ with integration tests before formally 
> voting.  If there is no more feedback I can start looking at implementations 
> (happy to have help)
>
> On Thursday, May 6, 2021, Wes McKinney <wesmck...@gmail.com> wrote:
>>
>> The PR looks good. I just left some comments about typos. I would say
>> it's probably about time to call a vote. Anywhere else where we should
>> be soliciting feedback?
>>
>> On Mon, May 3, 2021 at 2:17 PM Jacek Pliszka <jacek.plis...@gmail.com> wrote:
>> >
>> > Good idea, I've created JIRA issue:
>> >
>> > https://issues.apache.org/jira/browse/ARROW-12637
>> >
>> > And named it range to avoid confusion with intervals...
>> > Though confusion will stay as it is called interval in Pandas and in
>> > logic (Allen's interval algebra)
>> >
>> > BR,
>> >
>> > Jacek
>> >
>> > pon., 3 maj 2021 o 18:05 Micah Kornfield <emkornfi...@gmail.com> 
>> > napisał(a):
>> > >
>> > > Hi Jacek,
>> > > This seems like reasonable functionality.  I think the probably comes in
>> > > two parts:
>> > > 1.  This might be a good candidate for a "Well Known"/Officially 
>> > > supported
>> > > Extension type. I can think of a few different representations but I 
>> > > would
>> > > guess something like Struct[start: T, struct: end]] with well defined
>> > > extension metadata to define open/closed on start and end might be the 
>> > > best
>> > > (we should probably spin this off into a separate discussion thread).
>> > > 2.  Adding the right computation Kernels to work with the type.
>> > >
>> > > Do you want to start a new thread or open up some JIRAs to track this 
>> > > work?
>> > >
>> > > Thanks,
>> > > Micah
>> > >
>> > > On Mon, May 3, 2021 at 5:32 AM Jacek Pliszka <jacek.plis...@gmail.com>
>> > > wrote:
>> > >
>> > > > Sorry, my mistake.
>> > > >
>> > > > You are right - I meant anchored intervals as in pandas - ones with
>> > > > defined start and end - and I think many future users will make the
>> > > > same mistake.
>> > > >
>> > > > I would love to be able to do fast overlap joins on arrow level.
>> > > >
>> > > > Best Regards,
>> > > >
>> > > > Jacek
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > niedz., 2 maj 2021 o 23:06 Wes McKinney <wesmck...@gmail.com> 
>> > > > napisał(a):
>> > > > >
>> > > > > I also don't understand the comment about closed / open / semi-open
>> > > > > intervals. Perhaps there is a confusion, since "interval" as we mean
>> > > > > it here is called a "time delta" in some other projects. An interval
>> > > > > here does not refer to a time span with a distinct start and end 
>> > > > > point
>> > > > > (I understand this might be confusing to a pandas user since pandas
>> > > > > has an interval data type where each value is a tuple of arbitrary
>> > > > > start/end).
>> > > > >
>> > > > > On Sun, May 2, 2021 at 3:46 PM Micah Kornfield 
>> > > > > <emkornfi...@gmail.com>
>> > > > wrote:
>> > > > > >
>> > > > > > Hi Jacek,
>> > > > > > I'm not sure I fully understand the proposal, could you elaborate 
>> > > > > > with
>> > > > more
>> > > > > > examples/details?  For instance DAY_TIME isn't just a UINT64, it
>> > > > actually
>> > > > > > contains 2 seperate fields (days and milliseconds).
>> > > > > >
>> > > > > > In terms of closed vs half-open, in my limited understanding, that 
>> > > > > > is
>> > > > more
>> > > > > > a concern of functions using interval types rather than the type
>> > > > itself.
>> > > > > > For instance a quick search of postgres [1] docs only talks about
>> > > > half-open
>> > > > > > in relation to the "Overlaps" operator
>> > > > > >
>> > > > > > Thanks,
>> > > > > > -Micah
>> > > > > >
>> > > > > > [1] https://www.postgresql.org/docs/9.1/functions-datetime.html
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > On Sun, May 2, 2021 at 12:25 AM Jacek Pliszka 
>> > > > > > <jacek.plis...@gmail.com
>> > > > >
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Hi!
>> > > > > > >
>> > > > > > > I wonder if it were possible to have generic interval with 
>> > > > > > > integers
>> > > > of
>> > > > > > > specified size just to have common base for interval arithmetic.
>> > > > > > >
>> > > > > > > Then user can convert their period to ordinals and use the 
>> > > > > > > arithmetic
>> > > > > > > (joining, deoverlapping, common parts, explosion etc.).
>> > > > > > >
>> > > > > > > So YEAR_MONTH and DAY_TIME would be just special cases of
>> > > > > > > INTERVAL_UINT32 and INTERVAL_UINT64
>> > > > > > >
>> > > > > > > Also I believe it is worth to state whether there are only closed
>> > > > > > > intervals or open/semi-open ones are allowed as well.
>> > > > > > >
>> > > > > > > I believe I am just one of many reinventing the wheel here and
>> > > > writing
>> > > > > > > own versions of the above.
>> > > > > > >
>> > > > > > > BR,
>> > > > > > >
>> > > > > > > Jacek
>> > > > > > >
>> > > > > > >
>> > > > > > > pt., 2 kwi 2021 o 21:53 Micah Kornfield <emkornfi...@gmail.com>
>> > > > > > > napisał(a):
>> > > > > > > >
>> > > > > > > > Andrew is the use-case you have simply postgres compatibility 
>> > > > > > > > or
>> > > > is it
>> > > > > > > more
>> > > > > > > > extensive?
>> > > > > > > >
>> > > > > > > > One potential problem with combining Month and Day fields, is 
>> > > > > > > > that
>> > > > the
>> > > > > > > type
>> > > > > > > > no longer has a defined sort order (the existing 
>> > > > > > > > Day-Millisecond
>> > > > type
>> > > > > > > > without assumptions, in particular because I don't think today
>> > > > there is
>> > > > > > > an
>> > > > > > > > explicit constraint on the bounds for the millisecond 
>> > > > > > > > component).
>> > > > > > > >
>> > > > > > > > -Micah
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > On Wed, Mar 31, 2021 at 9:03 AM Antoine Pitrou 
>> > > > > > > > <anto...@python.org
>> > > > >
>> > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > > Le 31/03/2021 à 17:55, Micah Kornfield a écrit :
>> > > > > > > > > > Thanks for the feedback.  A couple of points here and some
>> > > > responses
>> > > > > > > > > below.
>> > > > > > > > > >
>> > > > > > > > > > * One other question is whether the Nanoseconds should
>> > > > actually be
>> > > > > > > > > > configurable (i.e. use milliseconds or microseconds).  I 
>> > > > > > > > > > would
>> > > > lean
>> > > > > > > > > towards
>> > > > > > > > > > no.
>> > > > > > > > >
>> > > > > > > > > Same for me.
>> > > > > > > > >
>> > > > > > > > > > * I'm also still not 100% convinced we need this as a first
>> > > > class
>> > > > > > > type in
>> > > > > > > > > > arrow or if we should be looking more closely at the Struct
>> > > > (in the
>> > > > > > > Arrow
>> > > > > > > > > > sense) based implementation.  In the future where 
>> > > > > > > > > > alternative
>> > > > > > > encodings
>> > > > > > > > > are
>> > > > > > > > > > supported, this could allow for much smaller footprints for
>> > > > this
>> > > > > > > type.
>> > > > > > > > >
>> > > > > > > > > Having a "packed" first class type allows for better locality
>> > > > when
>> > > > > > > > > accessing data.  It doesn't sound very likely that you'd 
>> > > > > > > > > access
>> > > > only
>> > > > > > > one
>> > > > > > > > > component of the interval.
>> > > > > > > > >
>> > > > > > > > > But I have no idea how important this is, and temporal 
>> > > > > > > > > datetypes
>> > > > are
>> > > > > > > > > generally cumbersome to add support for (conversions, 
>> > > > > > > > > arithmetic,
>> > > > > > > etc.),
>> > > > > > > > > so it would be nice to avoid adding too many of them :-)
>> > > > > > > > >
>> > > > > > > > > Regards
>> > > > > > > > >
>> > > > > > > > > Antoine.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > The 3
>> > > > > > > > > >> field implementation doesn't seem to have any way to 
>> > > > > > > > > >> represent
>> > > > > > > integral
>> > > > > > > > > >> days, so I am also not sure about that one.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > Sorry this was an email gaffe.  I intended Month (32 bit 
>> > > > > > > > > > int),
>> > > > Day
>> > > > > > > (32
>> > > > > > > > > bit
>> > > > > > > > > > int), Nanosecond (64 bit int).
>> > > > > > > > > >
>> > > > > > > > > > OTOH I don't really understand the point of supporting "the
>> > > > most
>> > > > > > > > > >> reasonable ranges for Year, Month and Nanoseconds
>> > > > independently".
>> > > > > > > What
>> > > > > > > > > >> does it bring to encode more than one month in the 
>> > > > > > > > > >> nanoseconds
>> > > > > > > field?
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > I'm happy with simplicity.   In the past there has been 
>> > > > > > > > > > some
>> > > > > > > reference to
>> > > > > > > > > > people wanting to store very large timestamps (fall out of
>> > > > > > > Nanoseconds
>> > > > > > > > > max
>> > > > > > > > > > representable value) but we've concluded that this wasn't
>> > > > something
>> > > > > > > that
>> > > > > > > > > we
>> > > > > > > > > > wanted to really support.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > On Wed, Mar 31, 2021 at 4:49 AM Antoine Pitrou <
>> > > > anto...@python.org>
>> > > > > > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > >>
>> > > > > > > > > >> I would favour the following characteristics :
>> > > > > > > > > >> - support for nanoseconds (especially as other Arrow 
>> > > > > > > > > >> temporal
>> > > > types
>> > > > > > > > > >> support it)
>> > > > > > > > > >> - easy to handle (which excludes the ZetaSQL 
>> > > > > > > > > >> representtaion
>> > > > IMHO)
>> > > > > > > > > >>
>> > > > > > > > > >> OTOH I don't really understand the point of supporting 
>> > > > > > > > > >> "the
>> > > > most
>> > > > > > > > > >> reasonable ranges for Year, Month and Nanoseconds
>> > > > independently".
>> > > > > > > What
>> > > > > > > > > >> does it bring to encode more than one month in the 
>> > > > > > > > > >> nanoseconds
>> > > > > > > field?
>> > > > > > > > > >> You can already use the Duration type for that.
>> > > > > > > > > >>
>> > > > > > > > > >> Regards
>> > > > > > > > > >>
>> > > > > > > > > >> Antoine.
>> > > > > > > > > >>
>> > > > > > > > > >>
>> > > > > > > > > >> Le 31/03/2021 à 05:48, Micah Kornfield a écrit :
>> > > > > > > > > >>> To follow-up on this conversation I did some analysis on
>> > > > interval
>> > > > > > > > > types:
>> > > > > > > > > >>>
>> > > > > > > > > >>>
>> > > > > > > > > >>
>> > > > > > > > >
>> > > > > > >
>> > > > https://docs.google.com/document/d/1i1E_fdQ_xODZcAhsV11Pfq27O50k679OYHXFJpm9NS0/edit
>> > > > > > > > > >> Please feel free to add more details/systems I missed.
>> > > > > > > > > >>>
>> > > > > > > > > >>> Given the disparate requirements of different systems I
>> > > > think the
>> > > > > > > > > >> following might make sense for official types (if there 
>> > > > > > > > > >> isn't
>> > > > > > > > > consensus, I
>> > > > > > > > > >> might try to contributation extension Array 
>> > > > > > > > > >> implementations
>> > > > for
>> > > > > > > them to
>> > > > > > > > > >> Java and C++/Python separately).
>> > > > > > > > > >>>
>> > > > > > > > > >>> 1.  3 fields: Year (32 bit), Month (32 bit), Nanoseconds 
>> > > > > > > > > >>> (64
>> > > > bit)
>> > > > > > > all
>> > > > > > > > > >> signed.
>> > > > > > > > > >>> 2.  Postgres representation (Downside is it doesn't 
>> > > > > > > > > >>> support
>> > > > > > > > > Nanoseconds,
>> > > > > > > > > >> only microseconds).
>> > > > > > > > > >>> 3.  ZetaSQL implementation (Requires some bit 
>> > > > > > > > > >>> manipulation)
>> > > > but
>> > > > > > > > > supports
>> > > > > > > > > >> the most reasonable ranges for Year, Month and Nanoseconds
>> > > > > > > > > independently.
>> > > > > > > > > >>>
>> > > > > > > > > >>> Thoughts?
>> > > > > > > > > >>>
>> > > > > > > > > >>> Micah
>> > > > > > > > > >>>
>> > > > > > > > > >>> On 2021/02/18 04:30:55 Micah Kornfield wrote:
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>> I didn’t find any page/documentation on how to do RFC 
>> > > > > > > > > >>>>> in
>> > > > Arrow
>> > > > > > > > > >> protocol,
>> > > > > > > > > >>>>> so can anyone point me to it or PR with email will be
>> > > > enough?
>> > > > > > > > > >>>>
>> > > > > > > > > >>>> That is enough to start discussion.  Before formal
>> > > > acceptance and
>> > > > > > > > > >> merging
>> > > > > > > > > >>>> of the PR there needs to be a Java and C++ 
>> > > > > > > > > >>>> implementations
>> > > > for the
>> > > > > > > > > type
>> > > > > > > > > >>>> that pass integration tests.  At the time this 
>> > > > > > > > > >>>> guideline was
>> > > > > > > > > instituted
>> > > > > > > > > >>>> Java and C++ were considered the "reference"
>> > > > implementations (I
>> > > > > > > think
>> > > > > > > > > >> they
>> > > > > > > > > >>>> still have the most complete integration test coverage).
>> > > > > > > > > >>>>
>> > > > > > > > > >>>> My understanding is that the current modelling of 
>> > > > > > > > > >>>> intervals
>> > > > > > > mimics SQL
>> > > > > > > > > >>>> standards (e.g. SQL Server [1]).  So it would also be 
>> > > > > > > > > >>>> good
>> > > > to step
>> > > > > > > > > back
>> > > > > > > > > >> and
>> > > > > > > > > >>>> understand what problem DF is trying to solve and how it
>> > > > differs
>> > > > > > > from
>> > > > > > > > > >> other
>> > > > > > > > > >>>> SQL implementations.  I'd be hesitant to accept COMPLEX 
>> > > > > > > > > >>>> as
>> > > > a new
>> > > > > > > type
>> > > > > > > > > >>>> without a much deeper analysis into calendar 
>> > > > > > > > > >>>> representations
>> > > > > > > within
>> > > > > > > > > >> Arrow
>> > > > > > > > > >>>> and how they relate to other existing systems (e.g. Hive
>> > > > and some
>> > > > > > > > > >>>> assortment of existing SQL databases).  For instance the
>> > > > current
>> > > > > > > > > >> modelling
>> > > > > > > > > >>>> of timestamps does not lend itself to constructing a 
>> > > > > > > > > >>>> COMPLEX
>> > > > > > > interval
>> > > > > > > > > >> type
>> > > > > > > > > >>>> particularly well. (Duration was introduced for this
>> > > > reason).
>> > > > > > > > > >>>>
>> > > > > > > > > >>>> I think both Wes's suggestion of FixedSizeBinary and
>> > > > Andrew's of
>> > > > > > > > > >> composing
>> > > > > > > > > >>>> the with a struct are good stop-gaps.  These obviously 
>> > > > > > > > > >>>> have
>> > > > > > > different
>> > > > > > > > > >>>> trade-offs.  Ultimately, it would be good to define 
>> > > > > > > > > >>>> common
>> > > > > > > extension
>> > > > > > > > > >> types
>> > > > > > > > > >>>> that can represent this use-case if there really is 
>> > > > > > > > > >>>> demand
>> > > > for it
>> > > > > > > (if
>> > > > > > > > > it
>> > > > > > > > > >>>> doesn't become a top level type).
>> > > > > > > > > >>>>
>> > > > > > > > > >>>> [1]
>> > > > > > > > > >>>>
>> > > > > > > > > >>
>> > > > > > > > >
>> > > > > > >
>> > > > https://docs.microsoft.com/en-us/sql/odbc/reference/appendixes/interval-data-types?view=sql-server-ver15
>> > > > > > > > > >>>>
>> > > > > > > > > >>>> -Micah
>> > > > > > > > > >>>>
>> > > > > > > > > >>>> On Wed, Feb 17, 2021 at 2:05 PM Andrew Lamb <
>> > > > al...@influxdata.com
>> > > > > > > >
>> > > > > > > > > >> wrote:
>> > > > > > > > > >>>>
>> > > > > > > > > >>>>> That is a great suggestion Wes, thank you.
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>> I wonder if we could get away with a 128 bit
>> > > > representation that
>> > > > > > > is
>> > > > > > > > > the
>> > > > > > > > > >>>>> concatenation of the two existing interval types
>> > > > > > > > > (YearMonth)(DayTime).
>> > > > > > > > > >> Or
>> > > > > > > > > >>>>> maybe even define a `struct` type with those fields 
>> > > > > > > > > >>>>> that
>> > > > is used
>> > > > > > > by
>> > > > > > > > > >>>>> DataFusion.
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>> Basically, given our reading of the Arrow spec[1], it 
>> > > > > > > > > >>>>> is
>> > > > > > > currently
>> > > > > > > > > not
>> > > > > > > > > >>>>> possible to precisely represent an interval that has 
>> > > > > > > > > >>>>> both
>> > > > > > > monthly and
>> > > > > > > > > >>>>> sub-montly granularity.
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>> As Dmtry says, if you have an interval seemingly simple
>> > > > like  1
>> > > > > > > > > month,
>> > > > > > > > > >> 1
>> > > > > > > > > >>>>> day
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>> Using IntervalUnit(YEAR_MONTH) can't represent the 1 
>> > > > > > > > > >>>>> day
>> > > > > > > > > >>>>> Using IntervalUnit(DAY_TIME) can't represent the month 
>> > > > > > > > > >>>>> as
>> > > > > > > different
>> > > > > > > > > >> months
>> > > > > > > > > >>>>> have different numbers of days
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>> [1]
>> > > > > > > > > >>>>>
>> > > > > > > > > >>
>> > > > > > >
>> > > > https://github.com/apache/arrow/blob/master/format/Schema.fbs#L249-L260
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>> On Wed, Feb 17, 2021 at 5:01 PM Wes McKinney <
>> > > > > > > wesmck...@gmail.com>
>> > > > > > > > > >> wrote:
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>>> On Wed, Feb 17, 2021 at 3:46 PM <t...@dmtry.me> wrote:
>> > > > > > > > > >>>>>>>
>> > > > > > > > > >>>>>>>> It's unclear to me that this needs to be introduced
>> > > > into the
>> > > > > > > > > >>>>> top-level
>> > > > > > > > > >>>>>>>
>> > > > > > > > > >>>>>>> Similar thing to columnar format, How to store 
>> > > > > > > > > >>>>>>> interval
>> > > > like 1
>> > > > > > > > > month
>> > > > > > > > > >> 1
>> > > > > > > > > >>>>>> day 1 hour? It’s not possible to do it without 
>> > > > > > > > > >>>>>> converting
>> > > > 1
>> > > > > > > month to
>> > > > > > > > > >> 30
>> > > > > > > > > >>>>>> days, which is a bad way.
>> > > > > > > > > >>>>>>>
>> > > > > > > > > >>>>>>
>> > > > > > > > > >>>>>> Presumably you can represent a complex interval in a 
>> > > > > > > > > >>>>>> fixed
>> > > > > > > number of
>> > > > > > > > > >>>>>> bytes, and then embed the data in a FixedSizeBinary 
>> > > > > > > > > >>>>>> type.
>> > > > You
>> > > > > > > can
>> > > > > > > > > >>>>>> adorn this type with extension type metadata so that
>> > > > DataFusion
>> > > > > > > can
>> > > > > > > > > >>>>>> then apply Interval semantics to it. This could also
>> > > > serve as an
>> > > > > > > > > >>>>>> interim strategy for you to proceed with 
>> > > > > > > > > >>>>>> implementation
>> > > > while
>> > > > > > > > > >>>>>> proposing a top-level type to the Arrow format (which 
>> > > > > > > > > >>>>>> may
>> > > > or
>> > > > > > > may not
>> > > > > > > > > >>>>>> be accepting) so you aren't blocked on acceptance of
>> > > > changes
>> > > > > > > into
>> > > > > > > > > >>>>>> Schema.fbs.
>> > > > > > > > > >>>>>>
>> > > > > > > > > >>>>>>>> On 17 Feb 2021, at 21:02, Wes McKinney <
>> > > > wesmck...@gmail.com>
>> > > > > > > > > wrote:
>> > > > > > > > > >>>>>>>>
>> > > > > > > > > >>>>>>>> It's unclear to me that this needs to be introduced
>> > > > into the
>> > > > > > > > > >>>>> top-level
>> > > > > > > > > >>>>>>>> columnar format without more analysis — have you
>> > > > considered
>> > > > > > > > > >>>>>>>> implementing this for DataFusion as an extension 
>> > > > > > > > > >>>>>>>> type
>> > > > for the
>> > > > > > > time
>> > > > > > > > > >>>>>>>> being?
>> > > > > > > > > >>>>>>>>
>> > > > > > > > > >>>>>>>> On Wed, Feb 17, 2021 at 11:59 AM t...@dmtry.me 
>> > > > > > > > > >>>>>>>> <mailto:
>> > > > > > > > > >> t...@dmtry.me
>> > > > > > > > > >>>>>>
>> > > > > > > > > >>>>>> <t...@dmtry.me <mailto:t...@dmtry.me>> wrote:
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>> Hi,
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>> For now, There are only two types of IntervalUnit
>> > > > inside
>> > > > > > > Arrow:
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>> - YearMonth - month stored as int32
>> > > > > > > > > >>>>>>>>> - DayTime - days as int32 and time in milliseconds 
>> > > > > > > > > >>>>>>>>>  as
>> > > > in32.
>> > > > > > > > > Total
>> > > > > > > > > >>>>>> (64 bites)
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>> Since DF is using Arrow, It’s not possible to store
>> > > > “Complex”
>> > > > > > > > > >>>>>> intervals such 1 MONTH 1 DAY 1 HOUR.
>> > > > > > > > > >>>>>>>>> I think, the best way to understand the problem 
>> > > > > > > > > >>>>>>>>> will
>> > > > be to
>> > > > > > > read a
>> > > > > > > > > >>>>>> comment from DF codebase:
>> > > > > > > > > >>>>>>
>> > > > > > > > > >>>>>
>> > > > > > > > > >>
>> > > > > > > > >
>> > > > > > >
>> > > > https://github.com/apache/arrow/blob/bca7d2fe84ccd8fc1129cb4d85448eb0779c52c3/rust/datafusion/src/sql/planner.rs#L1148
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>>          // Interval is tricky thing
>> > > > > > > > > >>>>>>>>>          // 1 day is not 24 hours because 
>> > > > > > > > > >>>>>>>>> timezones, 1
>> > > > year
>> > > > > > > !=
>> > > > > > > > > >>>>> 365/364!
>> > > > > > > > > >>>>>> 30 days != 1 month
>> > > > > > > > > >>>>>>>>>          // The true way to store and calculate
>> > > > intervals is
>> > > > > > > to
>> > > > > > > > > >> store
>> > > > > > > > > >>>>>> it as it defined
>> > > > > > > > > >>>>>>>>>          // Due the fact that Arrow supports only 
>> > > > > > > > > >>>>>>>>> two
>> > > > types
>> > > > > > > > > >> YearMonth
>> > > > > > > > > >>>>>> (month) and DayTime (day, time)
>> > > > > > > > > >>>>>>>>>          // It's not possible to store complex
>> > > > intervals
>> > > > > > > > > >>>>>>>>>          // It's possible to do select (NOW() +
>> > > > INTERVAL '1
>> > > > > > > > > year') +
>> > > > > > > > > >>>>>> INTERVAL '1 day'; as workaround
>> > > > > > > > > >>>>>>>>>          if result_month != 0 && (result_days != 0 
>> > > > > > > > > >>>>>>>>> ||
>> > > > > > > > > result_millis
>> > > > > > > > > >> !=
>> > > > > > > > > >>>>>> 0) {
>> > > > > > > > > >>>>>>>>>              return
>> > > > > > > Err(DataFusionError::NotImplemented(format!(
>> > > > > > > > > >>>>>>>>>                  "DF does not support intervals 
>> > > > > > > > > >>>>>>>>> that
>> > > > have
>> > > > > > > both a
>> > > > > > > > > >>>>>> Year/Month part as well as Days/Hours/Mins/Seconds: 
>> > > > > > > > > >>>>>> {:?}.
>> > > > Hint:
>> > > > > > > try
>> > > > > > > > > >>>>>> breaking the interval into two parts, one with 
>> > > > > > > > > >>>>>> Year/Month
>> > > > and
>> > > > > > > the
>> > > > > > > > > >> other
>> > > > > > > > > >>>>>> with Days/Hours/Mins/Seconds - e.g. (NOW() + INTERVAL 
>> > > > > > > > > >>>>>> '1
>> > > > year')
>> > > > > > > +
>> > > > > > > > > >>>>> INTERVAL
>> > > > > > > > > >>>>>> '1 day'",
>> > > > > > > > > >>>>>>>>>                  value
>> > > > > > > > > >>>>>>>>>              )));
>> > > > > > > > > >>>>>>>>>          }
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>> I prepared a PR
>> > > > > > > https://github.com/apache/arrow/pull/9516/files
>> > > > > > > > > <
>> > > > > > > > > >>>>>> https://github.com/apache/arrow/pull/9516/files> <
>> > > > > > > > > >>>>>> https://github.com/apache/arrow/pull/9516/files <
>> > > > > > > > > >>>>>> https://github.com/apache/arrow/pull/9516/files>> that
>> > > > > > > introduce a
>> > > > > > > > > >> new
>> > > > > > > > > >>>>>> type for IntervalUnit called Complex, that store both
>> > > > YearMonth
>> > > > > > > and
>> > > > > > > > > >>>>> DayTime
>> > > > > > > > > >>>>>> to support complex interval.
>> > > > > > > > > >>>>>>>>> I didn’t find any page/documentation on how to do 
>> > > > > > > > > >>>>>>>>> RFC
>> > > > in
>> > > > > > > Arrow
>> > > > > > > > > >>>>>> protocol, so can anyone point me to it or PR with 
>> > > > > > > > > >>>>>> email
>> > > > will be
>> > > > > > > > > >> enough?
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>> Thanks.
>> > > > > > > > > >>>>>>>
>> > > > > > > > > >>>>>>
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>
>> > > > > > > > > >>
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > >
>> > > >

Reply via email to