On Sun, Jan 27, 2019 at 1:08 PM Neville Dipale <nevilled...@gmail.com>
wrote:

> Hi Antoine,
>
> I've given your response some thought.
>
> I'm looking more at the computational aspect of Arrow. I agree
> that for representing and sharing data, RecordBatches achieve the purpose.
>
> I came across ChunkedArray, Column and Table while I was trying to create a
> dataframe library in Rust. The other languages already benefit from having
> these three implemented, but for Rust I've had to try to create them myself.
> This is what led me to ask the question, because the languages that I've
> seen so far seem to follow the same kind of standard regarding both the
> structure and the methods to create and interact with chunked arrays,
> columns, and tables.
>
> [1] Go Tables:
> https://github.com/apache/arrow/blob/master/go/arrow/array/table.go


There's also this WIP dataframe package being built on top of Arrow:
-  https://github.com/gonum/exp/pull/19

-s


> [2] CPP Tables:
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/table.cc
> [3] JS Tables: https://github.com/apache/arrow/blob/master/js/src/table.ts
> [4] Ruby:
> https://github.com/apache/arrow/blob/master/ruby/red-arrow/lib/arrow/table.rb
> [5] Python, pyarrow.Table
>
> While going through the source, I didn't find anything for Java, and that's
> swayed me to think that maybe Tables don't need standardising as each
> implementation would likely implement them differently (or not implement
> them).
>
> Regards
> Neville
>
> On Fri, 25 Jan 2019 at 20:56, Antoine Pitrou <anto...@python.org> wrote:
>
> >
> > Hello Neville,
> >
> > I don't know if Tables need standardizing.  Record Batches are part of
> > the spec (*), and they are the basic block for exchanging and sharing
> > tabular data.  Depending on your application, you might exchange a
> > stream of Record Batches, or a fixed-length sequence thereof (in which
> > case you have a "Table").
> >
> > (*) see https://arrow.apache.org/docs/metadata.html
> >
> > (reading that spec though, it's not obvious to me why the Record Batch
> > definition doesn't reference a Schema)
> >
> > Regards
> >
> > Antoine.
> >
> >
> > On 25/01/2019 at 19:48, Neville Dipale wrote:
> > > Hi Arrow developers,
> > >
> > > I've been looking at the various language impls, and although a Table
> > > isn't currently part of the spec, it seems to be implemented in CPP,
> > > Python, Go, JS (and perhaps other languages).
> > >
> > > Are there plans to standardise these and add them to the spec?
> > >
> > > I'm asking because I'm working on a dataframe implementation for Rust
> > > (https://github.com/nevi-me/rust-dataframe), and I've started trying to
> > > implement columns and tables with the intention to upstream them if I
> > > get them right.
> > >
> > > Regards
> > > Neville
> > >
> >
>
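
As an aside on the exchange above: the idea that a "Table" is a fixed-length
sequence of Record Batches, with each column viewed as a chunked array, can be
sketched in plain Python. This is a toy model only, not pyarrow or any Arrow
implementation. The class names RecordBatch, ChunkedArray and Table are
borrowed from the thread; their fields and methods here (columns, chunks,
num_rows, column, to_list) are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class RecordBatch:
    """Toy stand-in for a record batch: equal-length named columns."""
    columns: Dict[str, List[Any]]

    def __post_init__(self) -> None:
        lengths = {len(values) for values in self.columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns in a batch must have the same length")

    @property
    def num_rows(self) -> int:
        return len(next(iter(self.columns.values()), []))


@dataclass
class ChunkedArray:
    """One logical column stored as a list of chunks, one per batch."""
    chunks: List[List[Any]]

    def __len__(self) -> int:
        return sum(len(chunk) for chunk in self.chunks)

    def to_list(self) -> List[Any]:
        return [value for chunk in self.chunks for value in chunk]


class Table:
    """A fixed-length sequence of batches that share the same column names."""

    def __init__(self, batches: List[RecordBatch]) -> None:
        if not batches:
            raise ValueError("need at least one batch")
        names = list(batches[0].columns)
        for batch in batches[1:]:
            if list(batch.columns) != names:
                raise ValueError("all batches must share the same column names")
        self.names = names
        self.batches = batches

    @property
    def num_rows(self) -> int:
        return sum(batch.num_rows for batch in self.batches)

    def column(self, name: str) -> ChunkedArray:
        # No values are copied: each chunk is the batch's own list.
        return ChunkedArray([batch.columns[name] for batch in self.batches])


batches = [
    RecordBatch({"id": [1, 2], "name": ["a", "b"]}),
    RecordBatch({"id": [3], "name": ["c"]}),
]
table = Table(batches)
ids = table.column("id")
```

In this model the table owns nothing beyond the list of batches, so a column
with two chunks is simply the two per-batch lists seen side by side. A real
implementation would also carry a typed schema rather than bare column names.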
