Ah, that makes sense to wait then.
On Thu, May 6, 2021 at 10:55 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> I'll address the feedback. I think in the past we've waited for
> implementations in Java and C++ with integration tests before formally
> voting. If there is no more feedback I can start looking at implementations
> (happy to have help).
>
> On Thursday, May 6, 2021, Wes McKinney <wesmck...@gmail.com> wrote:
>>
>> The PR looks good. I just left some comments about typos. I would say
>> it's probably about time to call a vote. Anywhere else where we should
>> be soliciting feedback?
>>
>> On Mon, May 3, 2021 at 2:17 PM Jacek Pliszka <jacek.plis...@gmail.com> wrote:
>> >
>> > Good idea, I've created a JIRA issue:
>> >
>> > https://issues.apache.org/jira/browse/ARROW-12637
>> >
>> > And named it range to avoid confusion with intervals...
>> > Though the confusion will stay, as it is called interval in pandas and in
>> > logic (Allen's interval algebra).
>> >
>> > BR,
>> >
>> > Jacek
>> >
>> > On Mon, May 3, 2021 at 18:05, Micah Kornfield <emkornfi...@gmail.com> wrote:
>> > >
>> > > Hi Jacek,
>> > > This seems like reasonable functionality. I think the problem comes in
>> > > two parts:
>> > > 1. This might be a good candidate for a "Well Known"/officially supported
>> > > extension type. I can think of a few different representations, but I would
>> > > guess something like Struct[start: T, end: T] with well-defined
>> > > extension metadata to define open/closed on start and end might be the best
>> > > (we should probably spin this off into a separate discussion thread).
>> > > 2. Adding the right computation kernels to work with the type.
>> > >
>> > > Do you want to start a new thread or open up some JIRAs to track this work?
>> > >
>> > > Thanks,
>> > > Micah
>> > >
>> > > On Mon, May 3, 2021 at 5:32 AM Jacek Pliszka <jacek.plis...@gmail.com> wrote:
>> > >
>> > > > Sorry, my mistake.
>> > > >
>> > > > You are right - I meant anchored intervals as in pandas - ones with
>> > > > defined start and end - and I think many future users will make the
>> > > > same mistake.
>> > > >
>> > > > I would love to be able to do fast overlap joins at the Arrow level.
>> > > >
>> > > > Best Regards,
>> > > >
>> > > > Jacek
>> > > >
>> > > > On Sun, May 2, 2021 at 23:06, Wes McKinney <wesmck...@gmail.com> wrote:
>> > > > >
>> > > > > I also don't understand the comment about closed / open / semi-open
>> > > > > intervals. Perhaps there is a confusion, since "interval" as we mean
>> > > > > it here is called a "time delta" in some other projects. An interval
>> > > > > here does not refer to a time span with a distinct start and end point
>> > > > > (I understand this might be confusing to a pandas user since pandas
>> > > > > has an interval data type where each value is a tuple of arbitrary
>> > > > > start/end).
>> > > > >
>> > > > > On Sun, May 2, 2021 at 3:46 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>> > > > > >
>> > > > > > Hi Jacek,
>> > > > > > I'm not sure I fully understand the proposal, could you elaborate with more
>> > > > > > examples/details? For instance DAY_TIME isn't just a UINT64, it actually
>> > > > > > contains 2 separate fields (days and milliseconds).
>> > > > > >
>> > > > > > In terms of closed vs half-open, in my limited understanding, that is more
>> > > > > > a concern of functions using interval types rather than the type itself.
>> > > > > > For instance a quick search of the Postgres [1] docs only talks about half-open
>> > > > > > in relation to the "Overlaps" operator.
>> > > > > >
>> > > > > > Thanks,
>> > > > > > -Micah
>> > > > > >
>> > > > > > [1] https://www.postgresql.org/docs/9.1/functions-datetime.html
>> > > > > >
>> > > > > > On Sun, May 2, 2021 at 12:25 AM Jacek Pliszka <jacek.plis...@gmail.com> wrote:
>> > > > > >
>> > > > > > > Hi!
>> > > > > > >
>> > > > > > > I wonder if it would be possible to have a generic interval with integers of
>> > > > > > > a specified size, just to have a common base for interval arithmetic.
>> > > > > > >
>> > > > > > > Then users can convert their periods to ordinals and use the arithmetic
>> > > > > > > (joining, deoverlapping, common parts, explosion etc.).
>> > > > > > >
>> > > > > > > So YEAR_MONTH and DAY_TIME would be just special cases of
>> > > > > > > INTERVAL_UINT32 and INTERVAL_UINT64.
>> > > > > > >
>> > > > > > > Also I believe it is worth stating whether only closed intervals
>> > > > > > > are allowed, or open/semi-open ones as well.
>> > > > > > >
>> > > > > > > I believe I am just one of many reinventing the wheel here and writing
>> > > > > > > my own versions of the above.
>> > > > > > >
>> > > > > > > BR,
>> > > > > > >
>> > > > > > > Jacek
>> > > > > > >
>> > > > > > > On Fri, Apr 2, 2021 at 21:53, Micah Kornfield <emkornfi...@gmail.com> wrote:
>> > > > > > > >
>> > > > > > > > Andrew, is the use-case you have simply Postgres compatibility, or is it more
>> > > > > > > > extensive?
>> > > > > > > >
>> > > > > > > > One potential problem with combining Month and Day fields is that the
>> > > > > > > > type no longer has a defined sort order (the existing Day-Millisecond type
>> > > > > > > > without assumptions, in particular because I don't think today there is an
>> > > > > > > > explicit constraint on the bounds for the millisecond component).
>> > > > > > > >
>> > > > > > > > -Micah
>> > > > > > > >
>> > > > > > > > On Wed, Mar 31, 2021 at 9:03 AM Antoine Pitrou <anto...@python.org> wrote:
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > > On 31/03/2021 at 17:55, Micah Kornfield wrote:
>> > > > > > > > > > Thanks for the feedback. A couple of points here and some responses below.
>> > > > > > > > > >
>> > > > > > > > > > * One other question is whether the Nanoseconds should actually be
>> > > > > > > > > > configurable (i.e. use milliseconds or microseconds). I would lean towards no.
>> > > > > > > > >
>> > > > > > > > > Same for me.
>> > > > > > > > >
>> > > > > > > > > > * I'm also still not 100% convinced we need this as a first class type in
>> > > > > > > > > > Arrow or if we should be looking more closely at the Struct (in the Arrow
>> > > > > > > > > > sense) based implementation. In the future where alternative encodings are
>> > > > > > > > > > supported, this could allow for much smaller footprints for this type.
>> > > > > > > > >
>> > > > > > > > > Having a "packed" first class type allows for better locality when
>> > > > > > > > > accessing data.
>> > > > > > > > > It doesn't sound very likely that you'd access only one
>> > > > > > > > > component of the interval.
>> > > > > > > > >
>> > > > > > > > > But I have no idea how important this is, and temporal datatypes are
>> > > > > > > > > generally cumbersome to add support for (conversions, arithmetic, etc.),
>> > > > > > > > > so it would be nice to avoid adding too many of them :-)
>> > > > > > > > >
>> > > > > > > > > Regards
>> > > > > > > > >
>> > > > > > > > > Antoine.
>> > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >> The 3 field implementation doesn't seem to have any way to represent
>> > > > > > > > > >> integral days, so I am also not sure about that one.
>> > > > > > > > > >
>> > > > > > > > > > Sorry, this was an email gaffe. I intended Month (32 bit int), Day (32 bit
>> > > > > > > > > > int), Nanosecond (64 bit int).
>> > > > > > > > > >
>> > > > > > > > > >> OTOH I don't really understand the point of supporting "the most
>> > > > > > > > > >> reasonable ranges for Year, Month and Nanoseconds independently". What
>> > > > > > > > > >> does it bring to encode more than one month in the nanoseconds field?
>> > > > > > > > > >
>> > > > > > > > > > I'm happy with simplicity. In the past there has been some reference to
>> > > > > > > > > > people wanting to store very large timestamps (that fall outside the
>> > > > > > > > > > Nanoseconds max representable value) but we've concluded that this wasn't
>> > > > > > > > > > something that we wanted to really support.
>> > > > > > > > > >
>> > > > > > > > > > On Wed, Mar 31, 2021 at 4:49 AM Antoine Pitrou <anto...@python.org> wrote:
>> > > > > > > > > >
>> > > > > > > > > >>
>> > > > > > > > > >> I would favour the following characteristics:
>> > > > > > > > > >> - support for nanoseconds (especially as other Arrow temporal types
>> > > > > > > > > >> support it)
>> > > > > > > > > >> - easy to handle (which excludes the ZetaSQL representation IMHO)
>> > > > > > > > > >>
>> > > > > > > > > >> OTOH I don't really understand the point of supporting "the most
>> > > > > > > > > >> reasonable ranges for Year, Month and Nanoseconds independently". What
>> > > > > > > > > >> does it bring to encode more than one month in the nanoseconds field?
>> > > > > > > > > >> You can already use the Duration type for that.
>> > > > > > > > > >>
>> > > > > > > > > >> Regards
>> > > > > > > > > >>
>> > > > > > > > > >> Antoine.
>> > > > > > > > > >>
>> > > > > > > > > >>
>> > > > > > > > > >> On 31/03/2021 at 05:48, Micah Kornfield wrote:
>> > > > > > > > > >>> To follow up on this conversation I did some analysis on interval types:
>> > > > > > > > > >>>
>> > > > > > > > > >>> https://docs.google.com/document/d/1i1E_fdQ_xODZcAhsV11Pfq27O50k679OYHXFJpm9NS0/edit
>> > > > > > > > > >>> Please feel free to add more details/systems I missed.
>> > > > > > > > > >>>
>> > > > > > > > > >>> Given the disparate requirements of different systems I think the
>> > > > > > > > > >>> following might make sense for official types (if there isn't consensus, I
>> > > > > > > > > >>> might try to contribute extension Array implementations for them to
>> > > > > > > > > >>> Java and C++/Python separately).
>> > > > > > > > > >>>
>> > > > > > > > > >>> 1. 3 fields: Year (32 bit), Month (32 bit), Nanoseconds (64 bit), all signed.
>> > > > > > > > > >>> 2. Postgres representation (downside is it doesn't support nanoseconds,
>> > > > > > > > > >>> only microseconds).
>> > > > > > > > > >>> 3. ZetaSQL implementation (requires some bit manipulation) but supports
>> > > > > > > > > >>> the most reasonable ranges for Year, Month and Nanoseconds independently.
>> > > > > > > > > >>>
>> > > > > > > > > >>> Thoughts?
>> > > > > > > > > >>>
>> > > > > > > > > >>> Micah
>> > > > > > > > > >>>
>> > > > > > > > > >>> On 2021/02/18 04:30:55 Micah Kornfield wrote:
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>> I didn’t find any page/documentation on how to do RFC in Arrow protocol,
>> > > > > > > > > >>>>> so can anyone point me to it or PR with email will be enough?
>> > > > > > > > > >>>>
>> > > > > > > > > >>>> That is enough to start discussion. Before formal acceptance and merging
>> > > > > > > > > >>>> of the PR there need to be Java and C++ implementations for the type
>> > > > > > > > > >>>> that pass integration tests.
>> > > > > > > > > >>>> At the time this guideline was instituted
>> > > > > > > > > >>>> Java and C++ were considered the "reference" implementations (I think they
>> > > > > > > > > >>>> still have the most complete integration test coverage).
>> > > > > > > > > >>>>
>> > > > > > > > > >>>> My understanding is that the current modelling of intervals mimics SQL
>> > > > > > > > > >>>> standards (e.g. SQL Server [1]). So it would also be good to step back and
>> > > > > > > > > >>>> understand what problem DF is trying to solve and how it differs from other
>> > > > > > > > > >>>> SQL implementations. I'd be hesitant to accept COMPLEX as a new type
>> > > > > > > > > >>>> without a much deeper analysis into calendar representations within Arrow
>> > > > > > > > > >>>> and how they relate to other existing systems (e.g. Hive and some
>> > > > > > > > > >>>> assortment of existing SQL databases). For instance the current modelling
>> > > > > > > > > >>>> of timestamps does not lend itself to constructing a COMPLEX interval type
>> > > > > > > > > >>>> particularly well. (Duration was introduced for this reason).
>> > > > > > > > > >>>>
>> > > > > > > > > >>>> I think both Wes's suggestion of FixedSizeBinary and Andrew's of composing
>> > > > > > > > > >>>> them with a struct are good stop-gaps. These obviously have different
>> > > > > > > > > >>>> trade-offs.
>> > > > > > > > > >>>> Ultimately, it would be good to define common extension types
>> > > > > > > > > >>>> that can represent this use-case if there really is demand for it (if it
>> > > > > > > > > >>>> doesn't become a top level type).
>> > > > > > > > > >>>>
>> > > > > > > > > >>>> [1]
>> > > > > > > > > >>>> https://docs.microsoft.com/en-us/sql/odbc/reference/appendixes/interval-data-types?view=sql-server-ver15
>> > > > > > > > > >>>>
>> > > > > > > > > >>>> -Micah
>> > > > > > > > > >>>>
>> > > > > > > > > >>>> On Wed, Feb 17, 2021 at 2:05 PM Andrew Lamb <al...@influxdata.com> wrote:
>> > > > > > > > > >>>>
>> > > > > > > > > >>>>> That is a great suggestion Wes, thank you.
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>> I wonder if we could get away with a 128 bit representation that is the
>> > > > > > > > > >>>>> concatenation of the two existing interval types (YearMonth)(DayTime). Or
>> > > > > > > > > >>>>> maybe even define a `struct` type with those fields that is used by
>> > > > > > > > > >>>>> DataFusion.
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>> Basically, given our reading of the Arrow spec[1], it is currently not
>> > > > > > > > > >>>>> possible to precisely represent an interval that has both monthly and
>> > > > > > > > > >>>>> sub-monthly granularity.
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>> As Dmtry says, if you have an interval seemingly simple like 1 month, 1 day
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>> Using IntervalUnit(YEAR_MONTH) can't represent the 1 day
>> > > > > > > > > >>>>> Using IntervalUnit(DAY_TIME) can't represent the month as different months
>> > > > > > > > > >>>>> have different numbers of days
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>> [1]
>> > > > > > > > > >>>>> https://github.com/apache/arrow/blob/master/format/Schema.fbs#L249-L260
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>> On Wed, Feb 17, 2021 at 5:01 PM Wes McKinney <wesmck...@gmail.com> wrote:
>> > > > > > > > > >>>>>
>> > > > > > > > > >>>>>> On Wed, Feb 17, 2021 at 3:46 PM <t...@dmtry.me> wrote:
>> > > > > > > > > >>>>>>>
>> > > > > > > > > >>>>>>>> It's unclear to me that this needs to be introduced into the top-level
>> > > > > > > > > >>>>>>>
>> > > > > > > > > >>>>>>> Similar thing to columnar format, How to store interval like 1 month 1
>> > > > > > > > > >>>>>>> day 1 hour? It’s not possible to do it without converting 1 month to 30
>> > > > > > > > > >>>>>>> days, which is a bad way.
>> > > > > > > > > >>>>>>>
>> > > > > > > > > >>>>>>
>> > > > > > > > > >>>>>> Presumably you can represent a complex interval in a fixed number of
>> > > > > > > > > >>>>>> bytes, and then embed the data in a FixedSizeBinary type.
>> > > > > > > > > >>>>>> You can adorn this type with extension type metadata so that DataFusion can
>> > > > > > > > > >>>>>> then apply Interval semantics to it. This could also serve as an
>> > > > > > > > > >>>>>> interim strategy for you to proceed with implementation while
>> > > > > > > > > >>>>>> proposing a top-level type to the Arrow format (which may or may not
>> > > > > > > > > >>>>>> be accepted) so you aren't blocked on acceptance of changes into
>> > > > > > > > > >>>>>> Schema.fbs.
>> > > > > > > > > >>>>>>
>> > > > > > > > > >>>>>>>> On 17 Feb 2021, at 21:02, Wes McKinney <wesmck...@gmail.com> wrote:
>> > > > > > > > > >>>>>>>>
>> > > > > > > > > >>>>>>>> It's unclear to me that this needs to be introduced into the top-level
>> > > > > > > > > >>>>>>>> columnar format without more analysis - have you considered
>> > > > > > > > > >>>>>>>> implementing this for DataFusion as an extension type for the time
>> > > > > > > > > >>>>>>>> being?
>> > > > > > > > > >>>>>>>>
>> > > > > > > > > >>>>>>>> On Wed, Feb 17, 2021 at 11:59 AM t...@dmtry.me <t...@dmtry.me> wrote:
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>> Hi,
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>> For now, there are only two types of IntervalUnit inside Arrow:
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>> - YearMonth - month stored as int32
>> > > > > > > > > >>>>>>>>> - DayTime - days as int32 and time in milliseconds as int32.
>> > > > > > > > > >>>>>>>>>   Total: 64 bits.
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>> Since DF is using Arrow, it’s not possible to store “Complex”
>> > > > > > > > > >>>>>>>>> intervals such as 1 MONTH 1 DAY 1 HOUR.
>> > > > > > > > > >>>>>>>>> I think the best way to understand the problem will be to read a
>> > > > > > > > > >>>>>>>>> comment from the DF codebase:
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>> https://github.com/apache/arrow/blob/bca7d2fe84ccd8fc1129cb4d85448eb0779c52c3/rust/datafusion/src/sql/planner.rs#L1148
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>>     // Interval is tricky thing
>> > > > > > > > > >>>>>>>>>     // 1 day is not 24 hours because timezones, 1 year != 365/364! 30 days != 1 month
>> > > > > > > > > >>>>>>>>>     // The true way to store and calculate intervals is to store it as it defined
>> > > > > > > > > >>>>>>>>>     // Due the fact that Arrow supports only two types YearMonth (month) and DayTime (day, time)
>> > > > > > > > > >>>>>>>>>     // It's not possible to store complex intervals
>> > > > > > > > > >>>>>>>>>     // It's possible to do select (NOW() + INTERVAL '1 year') + INTERVAL '1 day'; as workaround
>> > > > > > > > > >>>>>>>>>     if result_month != 0 && (result_days != 0 || result_millis != 0) {
>> > > > > > > > > >>>>>>>>>         return Err(DataFusionError::NotImplemented(format!(
>> > > > > > > > > >>>>>>>>>             "DF does not support intervals
>> > > > > > > > > >>>>>>>>>             that have both a Year/Month part as well as Days/Hours/Mins/Seconds: {:?}. Hint: try breaking the interval into two parts, one with Year/Month and the other with Days/Hours/Mins/Seconds - e.g. (NOW() + INTERVAL '1 year') + INTERVAL '1 day'",
>> > > > > > > > > >>>>>>>>>             value
>> > > > > > > > > >>>>>>>>>         )));
>> > > > > > > > > >>>>>>>>>     }
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>> I prepared a PR, https://github.com/apache/arrow/pull/9516/files, that
>> > > > > > > > > >>>>>>>>> introduces a new type for IntervalUnit called Complex, which stores both
>> > > > > > > > > >>>>>>>>> YearMonth and DayTime to support complex intervals.
>> > > > > > > > > >>>>>>>>> I didn’t find any page/documentation on how to do RFC in Arrow
>> > > > > > > > > >>>>>>>>> protocol, so can anyone point me to it or PR with email will be enough?
>> > > > > > > > > >>>>>>>>>
>> > > > > > > > > >>>>>>>>> Thanks.
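
For anyone skimming the thread, here is a minimal sketch of the Month (int32) / Day (int32) / Nanosecond (int64) layout mentioned in the discussion, packed into 16 bytes of the kind that could be carried in a FixedSizeBinary(16) payload. The little-endian byte order, the helper names (pack_interval, unpack_interval, add_interval) and the clamp-to-month-end rule are assumptions made for illustration only; none of them come from the Arrow spec or from any existing implementation.

```python
import calendar
import struct
from datetime import date, timedelta

# Assumed layout (not from the Arrow spec): months int32, days int32,
# nanoseconds int64, little-endian, no padding -> 16 bytes per value.
_LAYOUT = struct.Struct("<iiq")


def pack_interval(months, days, nanos):
    """Serialize one interval value into 16 bytes."""
    return _LAYOUT.pack(months, days, nanos)


def unpack_interval(buf):
    """Return (months, days, nanos) from a 16-byte buffer."""
    return _LAYOUT.unpack(buf)


def add_interval(d, months, days):
    """Add calendar months, then days, to a date.

    Hypothetical rule: if the day does not exist in the target month,
    clamp to the last day of that month.
    """
    total = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day) + timedelta(days=days)


buf = pack_interval(1, 1, 3_600_000_000_000)   # 1 month, 1 day, 1 hour
print(len(buf), unpack_interval(buf))
print(add_interval(date(2021, 1, 31), 1, 0))   # one calendar month later
print(date(2021, 1, 31) + timedelta(days=30))  # thirty days later
```

The last two lines print different dates, which is the point made in the thread: a month cannot be folded into a fixed number of days without losing information, so the three components have to be stored separately.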