Hi Brian,

It's not one record batch per field. Each field describes a column in the schema. Record batches are partitions of the dataset. As such, all record batches have the same schema, which is defined in the footer. There can be any number of record batches for a given schema.

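To make the partitioning concrete, here is a minimal sketch in Python. It is not the Arrow API or the Java implementation: `Schema`, `RecordBatch`, and `split_into_batches` are hypothetical names made up for illustration, and plain Python lists stand in for Arrow buffers. It only shows that every batch holds a slice of the rows for every column and points at the same schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Schema:
    """Hypothetical stand-in for the schema stored once in the file footer."""
    field_names: Tuple[str, ...]


@dataclass
class RecordBatch:
    """Hypothetical stand-in for a record batch: some rows, all columns."""
    schema: Schema
    columns: Dict[str, list]

    @property
    def num_rows(self) -> int:
        # Every column in a batch has the same length; read it from any one.
        first = next(iter(self.columns.values()), [])
        return len(first)


def split_into_batches(schema: Schema, table: Dict[str, list],
                       batch_size: int) -> List[RecordBatch]:
    """Partition the rows of `table` into batches that share `schema`."""
    total_rows = len(table[schema.field_names[0]])
    batches = []
    for start in range(0, total_rows, batch_size):
        # Slice the same row range out of every column in the schema.
        cols = {name: table[name][start:start + batch_size]
                for name in schema.field_names}
        batches.append(RecordBatch(schema, cols))
    return batches


schema = Schema(("id", "name"))
table = {"id": [1, 2, 3, 4, 5], "name": ["a", "b", "c", "d", "e"]}
batches = split_into_batches(schema, table, batch_size=2)

for batch in batches:
    print(batch.num_rows, sorted(batch.columns))
```

With five rows and a batch size of two, this yields three batches. Each one contains both the `id` and `name` columns, and none of them "owns" a subset of the fields.
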
Then in each record batch:
- There are as many FieldNodes as there are Fields in total in the schema tree.
- For each field, the buffer count is defined by the layout attribute in Field.

IHTH,
Julien

On Thu, Sep 8, 2016 at 9:15 AM, Brian Hulette <bhule...@ccri.com> wrote:
> Hi all,
>
> I'm very interested in the Arrow file format - I would eventually like
> to use it to export data in a columnar format that can be read directly
> in a browser through a JavaScript library. I've been reviewing the
> specification and Julien's Java implementation, and I'm a little bit
> confused about the relationship between the Schema in the footer and the
> record batch(es).
>
> If a schema is referring to multiple record batches, is it assumed that
> the first fields in the schema refer to the first record batch, until
> all of its Buffers and FieldNodes are accounted for, then the next set
> of fields refer to the next record batch, and so on?
>
> If so, it doesn't seem like the current implementation supports this
> behavior. Which is fine, I just want to make sure I understand.
>
> Thanks,
>
> Brian Hulette

--
Julien