Re: [C++] Async Arrow Flight

2021-06-03 Thread David Li
I see, thanks. That sounds reasonable, and indeed if you are subscribing to lots of data sources (instead of just trying to maximize throughput as I think most people have done before) async may be helpful. There are actually two features being described here though: 1. The C++ Flight client n

Re: [Format] Timestamp timezone semantics?

2021-06-03 Thread Wes McKinney
Arrow's decision was not to permit storage of timestamps with "localized" representation (which is distinct from UTC internal representation with a different time zone set). The problem really comes down to the interpretation of "time zone naive" timestamps on different systems: operations in my op
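
To make the distinction in the snippet above concrete, here is a minimal sketch in plain Python (standard library only, not Arrow's API). It reflects one reading of "localized" storage, namely that the stored integer encodes the local wall-clock fields rather than the UTC instant. The helper names `to_epoch_ms` and `localized_ms` and the fixed -05:00 offset are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_MS = timedelta(milliseconds=1)

def to_epoch_ms(dt):
    # UTC-normalized storage: milliseconds since the Unix epoch.
    # The time zone attached to dt does not change the result.
    return (dt - EPOCH) // ONE_MS

def localized_ms(dt):
    # Hypothetical "localized" storage: take the wall-clock fields
    # and encode them as if they were UTC, discarding the offset.
    wall_clock = dt.replace(tzinfo=timezone.utc)
    return (wall_clock - EPOCH) // ONE_MS

# A fixed -05:00 offset stands in for a real time zone (assumption).
minus_five = timezone(timedelta(hours=-5))

instant_utc = datetime(2021, 6, 3, 18, 0, tzinfo=timezone.utc)
instant_local = instant_utc.astimezone(minus_five)  # 13:00 at -05:00

# Same instant, different zone metadata: identical stored value.
same = to_epoch_ms(instant_utc) == to_epoch_ms(instant_local)

# Localized storage of the same instant differs by the offset.
diff = localized_ms(instant_utc) - localized_ms(instant_local)

print(same)  # True
print(diff)  # 18000000 (five hours in milliseconds)
```

With UTC-normalized storage the zone is display metadata only, so two writers in different zones produce the same integer for the same instant. With localized storage the integers disagree unless the reader also knows which zone was used.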

Re: [Format] Timestamp timezone semantics?

2021-06-03 Thread Julian Hyde
It seems that Arrow’s timestamp type can either have no time zone or be UTC. I think that is a flawed design, because it doesn’t catch user errors. Suppose you want to find the number of milliseconds between two timestamps. If the first has a timezone and the second is implicitly UTC, then you can
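
As an illustration of what "catching user errors" can look like, here is a small sketch using Python's standard `datetime` module, which refuses to subtract a zone-aware value from a zone-naive one. This shows the behavior of Python's type, not of Arrow; the helper `millis_between` and the sample values are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def millis_between(a, b):
    # Whole milliseconds from b to a.
    return (a - b) // timedelta(milliseconds=1)

# One timestamp carries an explicit offset, the other carries none.
aware = datetime(2021, 6, 3, 14, 2, tzinfo=timezone(timedelta(hours=-4)))
naive = datetime(2021, 6, 3, 18, 2)

# Mixing the two is rejected instead of silently assuming UTC.
try:
    millis_between(aware, naive)
    mixed_rejected = False
except TypeError:
    mixed_rejected = True

# The caller has to state the assumption explicitly to get a number.
explicit = millis_between(aware, naive.replace(tzinfo=timezone.utc))

print(mixed_rejected)  # True
print(explicit)        # 0
```

The design choice shown here is that a missing time zone is treated as "unknown" rather than "UTC", so the arithmetic only succeeds once the caller attaches a zone on purpose.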

Re: [Format] Timestamp timezone semantics?

2021-06-03 Thread Adam Hooper
On Thu, Jun 3, 2021 at 2:02 PM Adam Hooper wrote: > I understand isAdjustedToUTC=true to mean "timestamp", and > isAdjustedToUTC=false to mean, "int64 and I hope somebody attached some > docs because > https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#local-semantics-timestamps

Re: [Format] Timestamp timezone semantics?

2021-06-03 Thread Adam Hooper
On Thu, Jun 3, 2021 at 1:17 PM Jorge Cardoso Leitão < jorgecarlei...@gmail.com> wrote: > That is my understanding as well, a timestamp either has a timezone or it > has not. If it does not have a timezone, it should be presented as is and > no assumptions can be made about its timezone. In particu

Re: [Format] Timestamp timezone semantics?

2021-06-03 Thread Jorge Cardoso Leitão
That is my understanding as well, a timestamp either has a timezone or it has not. If it does not have a timezone, it should be presented as is and no assumptions can be made about its timezone. In particular, given two fields X and Y, one with a timezone and another without, e.g. it is not mea

Re: [Format] Timestamp timezone semantics?

2021-06-03 Thread Julian Hyde
My answer to Antoine’s question would not be “kind of”, it would be “no”. In a system such as Joda-time, which I claim is the only system that Arrow should be considering, a timestamp-without-timezone does not have an implicit time zone of UTC. It has no time zone. > On Jun 3, 2021, at 8:52 AM

Re: [C++] Async Arrow Flight

2021-06-03 Thread Nate Bauernfeind
In addition to Arrow Flight we have other gRPC APIs that work together as a whole. For example, the API client establishes a session with the server. Then the client tells the server to create a derivative data stream by filtering/joining/computing/etc several source streams. The server will keep t

Re: [Format] Timestamp timezone semantics?

2021-06-03 Thread Micah Kornfield
> > Aren't those exactly the same (i.e. no timezone implicitly means UTC, > not local time)? Kind of, the reason we went with this approach is this sentence from the specification: "the data is "time zone naive" and shall be displayed *as is* to the user, not localized to the locale of the user.

Re: [Format] Timestamp timezone semantics?

2021-06-03 Thread Hongze Zhang
On Wed, 2021-06-02 at 13:56 -0700, Micah Kornfield wrote: > > > > Any SQL interface to Arrow should follow the SQL standard. So, for > > instance, if a column has TIMESTAMP type, it should behave as a > > date-time without a time-zone. > > > At least in bigquery we do the following mapping: > SQ

Re: C++ Migrate from Arrow 0.16.0

2021-06-03 Thread Antoine Pitrou
It should work as long as the union itself doesn't have nulls: https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/reader.cc#L351 Regards, Antoine. On 03/06/2021 at 06:42, Micah Kornfield wrote: I think the one place where it might break is for Union types (I seem to recall a bre

Re: [Format] Timestamp timezone semantics?

2021-06-03 Thread Antoine Pitrou
On 02/06/2021 at 22:56, Micah Kornfield wrote: Any SQL interface to Arrow should follow the SQL standard. So, for instance, if a column has TIMESTAMP type, it should behave as a date-time without a time-zone. At least in bigquery we do the following mapping: SQL TIMESTAMP -> Arrow Timesta