Forgot to say, my vote is +1 (binding).

On Thu, Nov 21, 2019 at 12:09 PM Wes McKinney <wesmck...@gmail.com> wrote:
> +1 (binding). Thanks Micah
>
> On Wed, Nov 20, 2019 at 10:42 PM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
> >
> > Hello,
> > As discussed on [1], I've proposed a PR [2] that clarifies the
> > following:
> >
> > 1. It is not required that all dictionary batches occur at the beginning
> > of the IPC stream format (if the first record batch has an all-null
> > dictionary-encoded column, the null column's dictionary might not be sent
> > until later in the stream).
> >
> > 2. A second dictionary batch for the same ID that is not a "delta batch"
> > in an IPC stream indicates the dictionary should be replaced.
> >
> > 3. Clarifies that the file format can contain only one "non-delta"
> > dictionary batch and multiple "delta" dictionary batches. Dictionary
> > replacement is not supported in the file format.
> >
> > 4. Add an enum to dictionary metadata for possible future changes to the
> > format in which dictionary batches can be sent (the most likely would be
> > an array Map<Int, Value>). An enum is needed as a placeholder to allow
> > for forward compatibility past the 1.0.0 release.
> >
> > If accepted, there will be work in all implementations to make sure that
> > they cover the edge cases highlighted, and additional integration testing
> > will be needed.
> >
> > Please vote whether to accept these additions. The vote will be open for
> > at least 72 hours.
> >
> > [ ] +1 Accept these changes to the specification
> > [ ] +0
> > [ ] -1 Do not accept the changes because...
> >
> > Thanks,
> > Micah
> >
> >
> > [1]
> > https://lists.apache.org/thread.html/d0f137e9db0abfcfde2ef879ca517a710f620e5be4dd749923d22c37@%3Cdev.arrow.apache.org%3E
> > [2] https://github.com/apache/arrow/pull/5585
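
For readers skimming the thread, points 1 to 3 of the proposal can be sketched in plain Python. This is only an illustration of the rules as worded above, not code from any Arrow implementation: the names DictionaryBatch, StreamDictionaryState and validate_file_dictionaries are hypothetical, and treating a delta batch for a not-yet-seen ID as an error is my assumption, not something the proposal states.

```python
from dataclasses import dataclass


@dataclass
class DictionaryBatch:
    """Hypothetical stand-in for a dictionary batch message."""
    dict_id: int
    values: list
    is_delta: bool = False


class StreamDictionaryState:
    """Tracks dictionaries while reading a stream (points 1 and 2)."""

    def __init__(self):
        self.dictionaries = {}

    def apply(self, batch):
        # Point 1: a dictionary batch may arrive at any position in the
        # stream, so this can be called between record batches.
        if batch.is_delta:
            # Assumption: a delta needs an existing dictionary to extend.
            if batch.dict_id not in self.dictionaries:
                raise ValueError(f"delta for unknown dictionary {batch.dict_id}")
            self.dictionaries[batch.dict_id].extend(batch.values)
        else:
            # Point 2: a non-delta batch for a known ID replaces the dictionary.
            self.dictionaries[batch.dict_id] = list(batch.values)


def validate_file_dictionaries(batches):
    """Point 3: in the file format, allow one non-delta batch per ID."""
    seen = set()
    for batch in batches:
        if batch.is_delta:
            continue
        if batch.dict_id in seen:
            raise ValueError(
                f"dictionary {batch.dict_id} replaced; not allowed in file format"
            )
        seen.add(batch.dict_id)
```

In a stream, applying a second non-delta batch with the same dict_id overwrites the earlier values, while the same sequence passed to validate_file_dictionaries raises an error.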