> Can we name it miniarrow or nanoarrow?

I'm happy to call it something else! Probably nanoarrow if I get to pick
because of the parallel with nanopb/nanodbc.

On Thu, Jun 16, 2022 at 6:26 AM Antoine Pitrou <anto...@python.org> wrote:

>
> Can we name it miniarrow or nanoarrow? We don't want to convey the
> message that there is a parallel C API for Arrow.
>
>
> Le 15/06/2022 à 05:18, Dewey Dunnington a écrit :
> > Hi all,
> >
> > I drafted a second PR [1] proposing a design for storing parsed information
> > obtained from a struct ArrowSchema (i.e., parsing the format string into
> > usable C structures). There are some unsolved problems that could use a
> > fresh perspective... all comments welcome!
> >
> > [1] https://github.com/paleolimbot/arrow-c/pull/5
> >
> > On Fri, Jun 10, 2022 at 12:27 PM Dewey Dunnington <de...@voltrondata.com>
> > wrote:
> >
> >> Hi all,
> >>
> >> As promised, I converted the design document [1] into an initial PR [2].
> >> Rather than draft the whole header, I started with README + implementations
> >> + testing for error handling and schema allocation (depending on feedback,
> >> next week I will draft another reviewable chunk).
> >>
> >> Also feel free to suggest another place to put this if one exists (the
> >> choice to put it in its own repo was based on informal feedback that
> >> perhaps that might be the best way to go).
> >>
> >> [1] https://docs.google.com/document/d/11n7ICVZO8exZ-z3GRlI26VLzKPXlYlEz5xjLl1y0ujU/edit?usp=sharing
> >> [2] https://github.com/paleolimbot/arrow-c/pull/1/files
> >>
> >> On Fri, Jun 3, 2022 at 12:41 PM Dewey Dunnington <de...@voltrondata.com>
> >> wrote:
> >>
> >>> Hi all,
> >>>
> >>> Based on the points raised above and a few adventures implementing some
> >>> of this in related projects, I put together a brief design document
> >>> proposing a scope and structure to perhaps solidify a few of these
> >>> discussions:
> >>>
> >>> https://docs.google.com/document/d/11n7ICVZO8exZ-z3GRlI26VLzKPXlYlEz5xjLl1y0ujU/edit?usp=sharing
> >>>
> >>> Any and all should feel free to add, rewrite, or propose a new
> >>> structure...I wrote many of the pieces for argument's sake or because
> >>> that's how I'd implemented them before.
> >>>
> >>> Next week I will phrase it as a skeleton header (like the one in the
> >>> excellent ADBC design discussions) depending on feedback to keep the
> >>> discussion going!
> >>>
> >>> Cheers,
> >>>
> >>> -dewey
> >>>
> >>> On Fri, Jun 3, 2022 at 9:57 AM Hannes Mühleisen <han...@duckdblabs.com>
> >>> wrote:
> >>>
> >>>> Hello List,
> >>>>
> >>>> We at DuckDB are happy users of the Arrow C Data Interface: we use it to
> >>>> feed SQL queries and to provide query results in Arrow format. It is
> >>>> particularly appealing to us that the interface is merely a (C) header
> >>>> file that we just ship with our source code [1]. Internally, our
> >>>> implementation then constructs DuckDB internal vectors from the Arrow
> >>>> format [2] or vice versa [3].
> >>>>
> >>>> As you can see from [2, 3], there is some complexity in getting the
> >>>> conversion right, especially for more complex data types like nested
> >>>> types (lists, strings). A lightweight, dependency-free library to help
> >>>> construct those would certainly be appreciated. Validation code would
> >>>> also help a lot: Arrow structures are very delicate and one wrong
> >>>> pointer can lead to disaster (which is then blamed on us), so a way to
> >>>> verify the structures in said lightweight library would be very helpful.
> >>>>
> >>>> Best from Amsterdam, and Quack
> >>>>
> >>>> Hannes
> >>>>
> >>>> [1] https://github.com/duckdb/duckdb/blob/master/src/include/duckdb/common/arrow.hpp
> >>>> [2] https://github.com/duckdb/duckdb/blob/master/src/function/table/arrow.cpp
> >>>> [3] https://github.com/duckdb/duckdb/blob/master/src/common/types/data_chunk.cpp
> >>>>
> >>>>
> >>>> On Fri, Jun 03, 2022 at 15:34:42, Jonathan Keane <jke...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> cc Hannes Mühleisen from DuckDB Labs
> >>>>>
> >>>>> -Jon
> >>>>>
> >>>>>
> >>>>> On Tue, May 31, 2022 at 5:03 PM Wes McKinney <wesmck...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>> I'm also supportive of having a small vendorable C/C++ "Arrow
> >>>>> middleware" that provides:
> >>>>>
> >>>>> * Schemas and types
> >>>>> * Columnar data structures and minimal APIs to build them and iterate
> >>>>>   over them
> >>>>> * C data interface
> >>>>> * Minimal validation (at the level of Validate but not ValidateFull)
> >>>>>
> >>>>> I don't think it's going to be practical to try to refactor parts of
> >>>>> the existing Arrow C++ core to be vendorable since there are many
> >>>>> features / requirements (e.g. an extensible buffer and device API)
> >>>>> that these C++ classes include that aren't needed in this
> >>>>> limited-feature middleware library.
> >>>>>
> >>>>> This also relates to the "Improving Arrow's database support" project
> >>>>> that David Li raised some time ago [1]. If we want to encourage
> >>>>> database driver libraries to add new APIs that emit the Arrow C
> >>>>> interface, we need to make it easier to generate the C interface
> >>>>> without requiring a new library dependency.
> >>>>>
> >>>>> [1]: https://lists.apache.org/thread/gnz1kz2rj3rb8rh8qz7l0mv8lvzq254w
> >>>>>
> >>>>> On Mon, May 30, 2022 at 11:31 AM Jonathan Keane <jke...@gmail.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> Thanks for working on this. I've heard people asking about something
> >>>>>> like this from a number of different fronts, on top of the obvious use
> >>>>>> case in geoarrow and other geospatial libraries. I think a minimal piece
> >>>>>> of Arrow that other packages could depend on without needing to bring
> >>>>>> in all of Arrow would be super valuable in building the bridges we
> >>>>>> want across other systems.
> >>>>>>
> >>>>>> Do you have any (design) documentation that describes the scope of
> >>>>>> what you're thinking? I know there have been others floating around
> >>>>>> [1] [2] that were in a similar spirit.
> >>>>>>
> >>>>>> A few more questions I hope will spark more conversation: How do the
> >>>>>> header files you linked in [3] overlap with these other efforts? Are
> >>>>>> those headers something we could (or should) "just" PR into apache/arrow
> >>>>>> and write up how to use them? If not, what work would it take to get
> >>>>>> them there? (The answer, of course, could be to design something else
> >>>>>> entirely and PR that!)
> >>>>>>
> >>>>>> [1] https://github.com/paleolimbot/narrow
> >>>>>> [2] https://paleolimbot.github.io/narrow/articles/why-narrow.html
> >>>>>> [3] https://github.com/paleolimbot/geoarrow-cpp/tree/main/src/geoarrow/internal/arrow-hpp
> >>>>>>
> >>>>>> -Jon
> >>>>>>
> >>>>>>
> >>>>>> On Wed, May 25, 2022 at 9:29 AM Dewey Dunnington <de...@voltrondata.com>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> I'm writing to gauge interest in a set of helpers in C and/or C++ for
> >>>>>>> reading/exporting Arrow C Data interface structures. My use-case is
> >>>>>>> building Arrow geospatial support in R [1], and while the set of helpers
> >>>>>>> I've been using [2] has served the purpose of me writing about the
> >>>>>>> opportunities for Arrow + geospatial [3], I would like to rewrite the
> >>>>>>> prototype based on something developed by/with the Arrow community.
> >>>>>>>
> >>>>>>> Does a set of C/C++ helpers for Arrow C Data interface structures already
> >>>>>>> exist? *Should* it exist?
> >>>>>>>
> >>>>>>> If it doesn't, what should the name/scope of that library be? The names
> >>>>>>> 'nanoarrow', 'narrow', 'sparrow', and 'arrow-hpp' have all surfaced in my
> >>>>>>> limited discussion of this so far. For the purpose of starting the
> >>>>>>> discussion, I'll posit that the library should include helpers to
> >>>>>>> allocate/destroy C Data interface structures, a schema metadata
> >>>>>>> encoder/decoder, validation of a schema/array pair, and something like the
> >>>>>>> ArrayBuilder C++ class.
> >>>>>>>
> >>>>>>> [1] https://lists.apache.org/thread/yb7p9wpg3k128njskhwj9j788opb67g7
> >>>>>>> [2] https://github.com/paleolimbot/geoarrow-cpp/tree/main/src/geoarrow/internal/arrow-hpp
> >>>>>>> [3] https://docs.google.com/document/d/1A6e3XCerjhXVFHBDaoAlBBNFb2HG4RB9SVRpuBru7E4/edit?usp=sharing
> >>>>>
> >>>>>
> >>>>
> >>>
> >
>
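
For readers unfamiliar with what "parsing the format string into usable C
structures" might involve, here is a minimal sketch in C. It is not the design
from the PR: every name in it (SketchType, SketchParsedFormat,
sketch_parse_format) is hypothetical, and it only handles a small subset of the
format strings defined by the Arrow C Data Interface ("i", "g", "u", "w:N",
"d:P,S", "+l", "+s"). Anything else, including the decimal form with an
explicit bit width, is rejected.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout: none of these come from the PR. */
enum SketchType {
  SKETCH_TYPE_UNKNOWN = 0,
  SKETCH_TYPE_INT32,             /* "i" */
  SKETCH_TYPE_DOUBLE,            /* "g" */
  SKETCH_TYPE_STRING,            /* "u" */
  SKETCH_TYPE_FIXED_SIZE_BINARY, /* "w:N" */
  SKETCH_TYPE_DECIMAL128,        /* "d:P,S" */
  SKETCH_TYPE_LIST,              /* "+l" */
  SKETCH_TYPE_STRUCT             /* "+s" */
};

struct SketchParsedFormat {
  enum SketchType type;
  int32_t fixed_size;        /* byte width for "w:N", otherwise 0 */
  int32_t decimal_precision; /* P for "d:P,S", otherwise 0 */
  int32_t decimal_scale;     /* S for "d:P,S", otherwise 0 */
};

/* Parse a base-10 integer at *cursor and advance the cursor past it.
 * Returns 0 on success, -1 if no digits were found or the value overflows. */
static int sketch_parse_int32(const char** cursor, int32_t* out) {
  char* end = NULL;
  errno = 0;
  long value = strtol(*cursor, &end, 10);
  if (end == *cursor || errno != 0) return -1;
  if (value < INT32_MIN || value > INT32_MAX) return -1;
  *out = (int32_t)value;
  *cursor = end;
  return 0;
}

/* Fill *out from a format string. Returns 0 on success and -1 for
 * malformed input or a format this sketch does not handle. */
int sketch_parse_format(const char* format, struct SketchParsedFormat* out) {
  assert(out != NULL);
  memset(out, 0, sizeof(*out));
  if (format == NULL) return -1;

  /* Formats with no parameters */
  if (strcmp(format, "i") == 0) { out->type = SKETCH_TYPE_INT32; return 0; }
  if (strcmp(format, "g") == 0) { out->type = SKETCH_TYPE_DOUBLE; return 0; }
  if (strcmp(format, "u") == 0) { out->type = SKETCH_TYPE_STRING; return 0; }
  if (strcmp(format, "+l") == 0) { out->type = SKETCH_TYPE_LIST; return 0; }
  if (strcmp(format, "+s") == 0) { out->type = SKETCH_TYPE_STRUCT; return 0; }

  /* Fixed-size binary: "w:" followed by a positive byte width */
  if (strncmp(format, "w:", 2) == 0) {
    const char* cursor = format + 2;
    if (sketch_parse_int32(&cursor, &out->fixed_size) != 0) return -1;
    if (*cursor != '\0' || out->fixed_size <= 0) return -1;
    out->type = SKETCH_TYPE_FIXED_SIZE_BINARY;
    return 0;
  }

  /* Decimal128: "d:" followed by precision, a comma, and scale */
  if (strncmp(format, "d:", 2) == 0) {
    const char* cursor = format + 2;
    if (sketch_parse_int32(&cursor, &out->decimal_precision) != 0) return -1;
    if (*cursor != ',') return -1;
    cursor++;
    if (sketch_parse_int32(&cursor, &out->decimal_scale) != 0) return -1;
    if (*cursor != '\0') return -1; /* explicit bit width not handled here */
    out->type = SKETCH_TYPE_DECIMAL128;
    return 0;
  }

  return -1;
}
```

A caller would pass schema->format from a struct ArrowSchema and check the
return value before reading any field of the result. Child types for "+l" and
"+s" live in schema->children and would need to be parsed recursively, which
this sketch leaves out.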
