The format change is ARROW-6836 ... add a custom_metadata:[KeyValue] field to the Footer table in File.fbs
The other change (slicing a recordbatch to honor RecordBatch.length rather than array length if the former is smaller) will hopefully not affect the format.

On Wed, Oct 9, 2019 at 11:55 PM Wes McKinney <wesmck...@gmail.com> wrote:

> Hi John,
>
> Since the 1.0.0 release is focused on Format stability, probably the
> only real "blockers" will be ensuring that we have hardened multiple
> implementations (in particular C++ and Java) of the columnar format as
> specified with integration tests to prove it. The issues you listed
> sound more like C++ library changes to me?
>
> If you want to propose Format-related changes, that would need to
> happen right away otherwise the ship will sail on that.
>
> - Wes
>
> On Wed, Oct 9, 2019 at 9:08 PM John Muehlhausen <j...@jgm.org> wrote:
> >
> > ARROW-5916
> > ARROW-6836/6837
> >
> > These are of particular interest to me because they enable recordbatch
> > "incrementalism" which is useful for streaming applications:
> >
> > ARROW-5916 allows a recordbatch to pre-allocate space for future records
> > that have not yet been populated, making it safe for readers to consume
> > the partial batch.
> >
> > ARROW-6836/6837 allows a file of record batches to be extended at the
> > end, without re-writing the beginning, while including the idea that the
> > custom_metadata may change with each update. (custom_metadata in the
> > Schema is not a good candidate because Schema also appears at the
> > beginning of the file.)
> >
> > While these are not blockers for me quite yet, they soon will be! If I
> > wanted to ensure that these are in 1.0, what is my deadline for
> > implementation and test cases? Can such a note be made on the wiki?
> > Should I change the priority in Jira?
> >
> > Thanks,
> > John
> >
> > On Wed, Oct 9, 2019 at 2:57 PM Neal Richardson
> > <neal.p.richard...@gmail.com> wrote:
> >
> > > Congratulations everyone on 0.15! I know a lot of hard work went into
> > > it, not only in the software itself but also in the build and release
> > > process.
> > >
> > > Once you've caught your breath from the release, we should start
> > > thinking about what's in scope for our next release, the big 1.0. To
> > > get us started (or restarted, since we did discuss 1.0 before the
> > > flatbuffer alignment issue came up), I've created
> > > https://cwiki.apache.org/confluence/display/ARROW/Arrow+1.0.0+Release
> > > based on our past release wiki pages.
> > >
> > > A good place to begin would be to list, either in "blocker" Jiras or
> > > bullet points on the document, the key features and tasks we must
> > > resolve before 1.0. For example, I get the sense that we need to
> > > overhaul the documentation, but that should be expressed in a more
> > > concrete, actionable way.
> > >
> > > Neal