Personally I'd love to see a cookbook where a recipe is accompanied by
examples of how to accomplish it in multiple languages rather than having
separate cookbooks for each language.
Though that may just be me wanting to see more love for the Golang
implementation
On Wed, Jul 7, 2021, 8:57 PM
Thanks to all who attended. Meeting notes:
Attendees:
Nate Bauernfeind
Ian Cook
Nic Crane
Alenka Frim
Micah Kornfield
Jorge Leitao
Alessandro Molina
Weston Pace
Eduardo Ponce
Pol Santamaria
Discussion:
- Arrow 5.0.0 release
- Goal: release end of next week or worst case the third week of July
Hi Arrow devs,
Since 2017, it has been possible to install the Arrow C++ library
using the vcpkg package manager[1], but until recently, the Arrow
vcpkg port ("port" is their term for a package) was maintained by
community members, not by core Arrow devs. This led to a pattern of
irregular updates
Good question. I'll take a stab at answering some of it.
C++ has the same passthru / interoperability concerns. Python is
significant as its built-in datetime module distinguishes between
"local" and "instant" datetimes (which it calls naive and non-naive).
In addition, pandas which has a very s
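
A minimal sketch of the "local" versus "instant" distinction described above, using only Python's standard library. The Python documentation calls the two kinds "naive" and "aware". The variable names and the sample date are illustrative assumptions, not taken from the thread.

```python
from datetime import datetime, timedelta, timezone

# A naive datetime has no tzinfo: it reads as a local wall-clock time.
naive = datetime(2021, 7, 7, 13, 33)

# An aware datetime carries a tzinfo, so it identifies a single instant.
aware_utc = datetime(2021, 7, 7, 13, 33, tzinfo=timezone.utc)

# The same instant expressed with a fixed +00:10 offset.
aware_plus10m = aware_utc.astimezone(timezone(timedelta(minutes=10)))

print(naive.tzinfo)                # None
print(aware_utc == aware_plus10m)  # True: same instant, different offset

# Ordering a naive value against an aware one is rejected rather than guessed.
try:
    naive < aware_utc
except TypeError as exc:
    print("cannot compare:", exc)
```

This is presumably why passing timestamps through without their timezone string matters for interoperability: once the tzinfo is gone, an aware value cannot be told apart from a naive one.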
Thanks everyone for their input;
Interoperability would be the biggest issue; how much does C++ do with the
timezone string?
-Evan
> On Jul 7, 2021, at 1:33 PM, Weston Pace wrote:
>
> I don't know about removal but you could probably ignore the timezone
> string and it's not clear the issues
Here is additional food for thought.
The cookbook currently contains examples for C++, R, and Python. Is there a
plan (or wish) to eventually extend a single cookbook to include examples
from other languages (e.g., Rust, Java)?
If so, then putting the cookbook into its own (language agnostic) repo
w
Great work!
I would recommend having the cookbook in its own repo so that its updates
are not constrained by the timeline used for updating the public Arrow
documentation.
This will allow users that are not involved in Arrow development to
contribute or provide suggestions to the cookbook fairly ea
Awesome! We would find C++ versions of these recipes very useful. From our
experience, the C++ API is much, much harder to deal with and more error-prone
than the R/Python one.
Cheers,
Rares
On Wed, Jul 7, 2021 at 9:07 AM Alessandro Molina <
alessan...@ursacomputing.com> wrote:
> Yes, that was mostly w
I have created a PR for the changes we discussed.
https://github.com/apache/arrow/pull/10679
It would be great if you guys could go through it. I'm still benchmarking
the results. And I also have some ideas to reduce the branches in the main
while loop in the Primitive Types implementation. I will
I don't know about removal but you could probably ignore the timezone
string and it's not clear the issues would be that significant.
If Rust never produces a non-null non-UTC timestamp then I don't see
that as an issue.
If you are consuming data with a timestamp string other than UTC it
isn't re
To summarize so far, it sounds like schema evolution is neither sufficient nor
necessary for either Gosh or Nate's use-cases here? It could be useful for
FlightSQL but even there I don't think it's a requirement.
For Nate - it almost sounds like what you need is some way to slice up a record
ba
> Flatbuffers does not support modifying structs
> in any forwards or backwards compatible way
> (only tables support evolution).
Bah. I did not realize that.
To reiterate the feature that would be ideal:
I realize the specific feature I am missing is the ability to encode that a
field (i.e. its
On Wed, 7 Jul 2021 at 18:46, Jorge Cardoso Leitão
wrote:
> Hi,
>
> AFAIK timezone is part of the spec.
And for reference, the current spec (Schema flatbuffer file) for timestamp
is at
https://github.com/apache/arrow/blob/6c8d30ea8fd2750b999840872d3f6cbdc8f8/format/Schema.fbs#L217-L247.
>
Hi,
AFAIK timezone is part of the spec. In Python, that would be [1]
import pyarrow as pa
dt1 = pa.timestamp("ms", "+00:10")
dt2 = pa.timestamp("ms")
arrow-rs is not very consistent with how it handles it. imo that is an
artifact of being currently difficult (API wise) to create an array with a
Hi folks,
Some of us are having a discussion about a direction change for Rust Arrow
timestamp types, which currently support both a resolution field (Ns, Micros, Ms,
Seconds) similar to the other language implementations, but also optionally a
timezone string field. I believe the timezone fiel
Yes, that was mostly what I meant when I wrote that the next step is
opening a PR against the apache/arrow repository itself :D
We moved forward in a separate repository initially to be able to cycle
more quickly, but we reached a point where we think we can start
integrating the cookbook with the
Is this still happening today?
On Tue, Jul 6, 2021 at 11:07 AM Ian Cook wrote:
> Hi all,
>
> Our biweekly sync call is tomorrow at
> https://meet.google.com/vtm-teks-phx. All are welcome to join. Notes
> will be shared with the mailing list afterward.
>
> Ian
>
--
Update: For the meeting starting now, please use this Google Meet URL:
https://meet.google.com/ebp-tczo-xjn
Ian
On Tue, Jul 6, 2021 at 12:07 PM Ian Cook wrote:
>
> Hi all,
>
> Our biweekly sync call is tomorrow at
> https://meet.google.com/vtm-teks-phx. All are welcome to join. Notes
> will be sh
>
> Might there be interest in adding a "field_id" to the FieldNode (which is
> encoded on the RecordBatch flatbuffer)? I see a simple forward-compatible
> upgrade (by either keying off of 0, or explicitly set the field default to
> -1) which would allow the sender to "skip" fields that have 1) Fie
What do you think about developing this cookbook in an Apache Arrow
repository (it could be something like apache/arrow-cookbook, if not
part of the main development repo)? Creating expanded documentation
resources for learning how to use Apache Arrow to solve problems seems
certainly within the bo
We finally have a first preview of the cookbook available for R and Python.
For anyone interested, the two versions are visible at
http://ursacomputing.com/arrow-cookbook/py/index.html and
http://ursacomputing.com/arrow-cookbook/r/index.html
A new version of the cookbook is automatically published o
Retitling and forking the discussion to talk about key value pairs.
What is the byte cost of an empty list? Another option would be to
introduce a new BinaryKeyValue table and add binary metadata.
On Wed, Jul 7, 2021 at 8:32 AM Nate Bauernfeind <
natebauernfe...@deephaven.io> wrote:
> Deephaven
Deephaven and I are very supportive of "upgrading" the value half of the kv
pair to a byte vector. What is the best way to find out if there is
sufficient interest?
I've been stewing on the ideas here around schema evolution, and I realize
the specific feature I am missing is the ability to encod
On Wed, Jul 7, 2021 at 2:53 PM David Li wrote:
>
> From the Flatbuffers internals doc[1] it appears they are the same: "Strings
> are simply a vector of bytes, and are always null-terminated."
I see. I took a look at flatbuffers.h, and it appears that changing
this field from string to [byte] wo
From the Flatbuffers internals doc[1] it appears they are the same: "Strings
are simply a vector of bytes, and are always null-terminated."
[1]: https://google.github.io/flatbuffers/flatbuffers_internals.html
-David
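
A rough sketch of the layout the quoted sentence describes, using only Python's `struct` module and not the Flatbuffers library. It assumes a little-endian 32-bit length prefix and leaves out alignment padding and offsets; the helper names `encode_string` and `encode_byte_vector` are hypothetical.

```python
import struct

def encode_string(text: str) -> bytes:
    """Length-prefixed UTF-8 bytes plus a trailing NUL not counted in the length."""
    data = text.encode("utf-8")
    return struct.pack("<I", len(data)) + data + b"\x00"

def encode_byte_vector(data: bytes) -> bytes:
    """Length-prefixed raw bytes, with no terminator."""
    return struct.pack("<I", len(data)) + data

s = encode_string("abc")
v = encode_byte_vector(b"abc")

print(s.hex())
print(v.hex())

# Under these assumptions the string is the byte vector plus one NUL byte.
print(s[:len(v)] == v)
```

Under these assumptions, a reader expecting a byte vector would see the same length and payload in a string field. The reverse direction is less safe, since a byte vector carries no terminator.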
On Wed, Jul 7, 2021, at 05:08, Wes McKinney wrote:
> On Tue, Jul 6, 2021 at 6
I investigated the cpython approach and the PR labelling is a part of
the existing bedevere bot which does a number of things (not all
relevant to Arrow). Yesterday I created a standalone GitHub Action[1]
dedicated to this task roughly based on my previous email. It will
apply "awaiting-review" a
On Tue, Jul 6, 2021 at 6:33 PM Micah Kornfield wrote:
>
> >
> > Right, I had wanted to focus the discussion on Flight as I think schema
> > evolution or multiplexing streams (more so the latter) is a property of the
> > transport and not the stream format itself. If we are leaning towards just
> >