Re: [DISCUSS][Format] Time Interval Changes

2019-04-03 Thread Micah Kornfield
Sgtm, I think a PMC member needs to kick it off?

On Wednesday, April 3, 2019, Wes McKinney wrote:
> Agreed
>
> On Wed, Apr 3, 2019 at 9:53 AM Jacques Nadeau wrote:
> >
> > Option 1 sounds good to me. Let's take to a vote.
> >
> > On Tue, Apr 2, 2019 at 8:53 PM Micah Kornfield wrote:

Re: [DISCUSS][Format] Time Interval Changes

2019-04-03 Thread Wes McKinney
Agreed

On Wed, Apr 3, 2019 at 9:53 AM Jacques Nadeau wrote:
>
> Option 1 sounds good to me. Let's take to a vote.
>
> On Tue, Apr 2, 2019 at 8:53 PM Micah Kornfield wrote:
>>
>> Based on the discussion so far, my attempt at concrete Schema proposals
>> below. Jacques I think summarizes what w

Re: [DISCUSS][Format] Time Interval Changes

2019-04-03 Thread Jacques Nadeau
Option 1 sounds good to me. Let's take to a vote.

On Tue, Apr 2, 2019 at 8:53 PM Micah Kornfield wrote:
> Based on the discussion so far, my attempt at concrete Schema proposals
> below. Jacques I think summarizes what we've discussed, apologies if
> I've misunderstood. Wes would Option 1 wo

Re: [DISCUSS][Format] Time Interval Changes

2019-04-02 Thread Micah Kornfield
Based on the discussion so far, my attempt at concrete Schema proposals is below. Jacques, I think this summarizes what we've discussed; apologies if I've misunderstood. Wes, would Option 1 work to support the Pandas Time Delta use-case? I'm leaning towards Option 1 if it satisfies everyone (but happy t

Re: [DISCUSS][Format] Time Interval Changes

2019-04-02 Thread Wes McKinney
Since there were some mentions of leap seconds: I think the intent of the timedelta/duration type should be to express the difference between UNIX timestamps (from second to nanosecond resolution), which don't include leap seconds. We use the timedelta64[ns] type in pandas for example, which is a
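A minimal sketch of the point about leap seconds, using only the Python standard library rather than pandas. The helper names `unix_nanos` and `duration_nanos` are hypothetical and exist only for this illustration. It assumes a duration is stored as a signed 64-bit count of nanoseconds, computed as the difference of two UNIX timestamps:

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_nanos(dt: datetime) -> int:
    """Nanoseconds since the UNIX epoch; UNIX time does not count leap seconds."""
    delta = dt - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * NANOS_PER_SECOND + delta.microseconds * 1_000


def duration_nanos(start: datetime, end: datetime) -> int:
    """Duration as the plain integer difference of two UNIX timestamps."""
    return unix_nanos(end) - unix_nanos(start)


# A leap second (23:59:60 UTC) was inserted at the end of 2016-12-31.
# UNIX timestamps skip it, so the duration across it is exactly one second.
before = datetime(2016, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
after = datetime(2017, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
print(duration_nanos(before, after))  # 1000000000
```

The value is an ordinary integer difference, so it fits in a signed 64-bit field for any pair of timestamps that themselves fit in one at nanosecond resolution.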

Re: [DISCUSS][Format] Time Interval Changes

2019-04-02 Thread Jacques Nadeau
>
> I could go either way, it has some benefits for forward compatibility I
> suppose, but on the other hand YAGNI, if you feel strongly, I'm ok
> including it. However, the more optional fields we have for a specific
> enum value, makes me lean more towards a new type instead of just an enum.
> I

Re: [DISCUSS][Format] Time Interval Changes

2019-04-01 Thread Micah Kornfield
On Mon, Apr 1, 2019 at 4:17 PM Jacques Nadeau wrote:
>
>>
>> I don't think we should include byte-width unless we have a concrete
>> use-case (it can be added later, using 8 Bytes as the default if not set).
>>
> I'm okay with only allowing one today. I wonder whether we should declare
> it now a

Re: [DISCUSS][Format] Time Interval Changes

2019-04-01 Thread Jacques Nadeau
>
> I don't think we should include byte-width unless we have a concrete
> use-case (it can be added later, using 8 Bytes as the default if not set).
>
I'm okay with only allowing one today. I wonder whether we should declare it now and only allow 8?
>
> Comment below on equivalences, is that

Re: [DISCUSS][Format] Time Interval Changes

2019-04-01 Thread Micah Kornfield
Sorry, sent this too early. TL;DR: I'm in favor of moving forward with this declaration:

table Interval {
  unit: IntervalUnit;
  timeUnit: TimeUnit; // defined when using duration
}

I don't think we should include byte-width unless we have a concrete use-case (it can be added later, using 8 Bytes
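To make the "defined when using duration" comment concrete, here is a rough Python model of the declaration. It is only a sketch under assumptions: the `TimeUnit` members are modeled on Arrow's existing time units and are not spelled out in this thread, and rejecting a `time_unit` on non-duration intervals is one possible reading of the comment, not something the proposal states.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class IntervalUnit(Enum):
    YEAR_MONTH = 0
    DAY_TIME = 1
    DURATION = 2


class TimeUnit(Enum):
    # Assumed members, modeled on Arrow's existing TimeUnit enum.
    SECOND = 0
    MILLISECOND = 1
    MICROSECOND = 2
    NANOSECOND = 3


@dataclass(frozen=True)
class Interval:
    unit: IntervalUnit
    time_unit: Optional[TimeUnit] = None  # only meaningful for DURATION

    def __post_init__(self) -> None:
        is_duration = self.unit is IntervalUnit.DURATION
        if is_duration and self.time_unit is None:
            raise ValueError("DURATION intervals need a time_unit")
        if not is_duration and self.time_unit is not None:
            # Assumption: treat a stray time_unit as an error, not ignore it.
            raise ValueError("time_unit is only defined for DURATION")


duration_ns = Interval(IntervalUnit.DURATION, TimeUnit.NANOSECOND)
year_month = Interval(IntervalUnit.YEAR_MONTH)
```

The cross-field check is the cost of the single-table design: each field that only applies to one enum value adds another rule like this, which is the argument for a separate type as those fields accumulate.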

Re: [DISCUSS][Format] Time Interval Changes

2019-04-01 Thread Micah Kornfield
TL;DR; I'm in favor of moving forward with this declaration:

On Mon, Apr 1, 2019 at 11:38 AM Jacques Nadeau wrote:
> I'm sorry, I've been busy with several other things.
>
> A question, what about this alternative?
> enum IntervalUnit: short { YEAR_MONTH, DAY_TIME, DURATION }
> table Interva

Re: [DISCUSS][Format] Time Interval Changes

2019-04-01 Thread Jacques Nadeau
I'm sorry, I've been busy with several other things. A question, what about this alternative?

enum IntervalUnit: short { YEAR_MONTH, DAY_TIME, DURATION }
table Interval {
  unit: IntervalUnit;
  timeUnit: TimeUnit; // defined when using duration
  byteWidth: short; // defined when using duration

Re: [DISCUSS][Format] Time Interval Changes

2019-04-01 Thread Wes McKinney
I would like to propose a vote on this feature this week. Could someone from the Java side weigh in since there is some existing code relating to intervals there already?

On Wed, Mar 27, 2019 at 10:49 PM Micah Kornfield wrote:
>
> Hi Wes,
> Thanks for the feedback. I'm happy to update the PR to

Re: [DISCUSS][Format] Time Interval Changes

2019-03-27 Thread Micah Kornfield
Hi Wes, Thanks for the feedback. I'm happy to update the PR to include C++ and Python once there is consensus on the format change. I'd also welcome feedback and an extra set of eyes on the issues I raised below, since it is hard to change once we make a release. Based on previous discussions, I

Re: [DISCUSS][Format] Time Interval Changes

2019-03-27 Thread Wes McKinney
hi Micah, Sorry for the delay. I'm in favor of introducing the Duration/DurationInterval type to unblock the difference-of-timestamps / timedelta use case that many Arrow users have. I'd like Jacques or someone from the Java side to comment about this before starting a vote. We can merge these c

Re: [DISCUSS][Format] Time Interval Changes

2019-03-22 Thread Micah Kornfield
Hi arrow-dev, I just wanted to bump this thread to see if anyone wanted to comment or discuss a path forward. If no one chimes in by Monday evening, could I ask a PMC member to start a vote on Tuesday (I believe a member of the PMC needs to initiate a vote?) I will implement the C++ side once the