@Micah Kornfield <[email protected]> Thanks a lot for your comments.
In the doc, we identify three problems with the current dictionary encoding use case (there may be more, so suggestions are welcome):

1. There should be a convenient way to access both the encoded and the decoded data.
2. The constructor for a schema with a dictionary is misleading.
3. We should provide a way to do the encoding/decoding during serialization/deserialization, so that it is transparent to the user.

The current PR provides a solution for problem 2, and it is a relatively small change, so we think it can be merged first. Solutions for problems 1 and 3 should be chosen carefully so as not to affect existing APIs, so they need more discussion and design work.

Does this seem reasonable?

Best,
Liya Fan

On Wed, Jun 12, 2019 at 4:01 PM Micah Kornfield <[email protected]> wrote:

> Hi Liya Fan,
> Thank you for doing this. I need to take a closer look at the PR in
> question and the dictionary encoding code but this seems like it is on the
> right track.
>
> Could other java contributors with more familiarity in the space look over
> the document to make sure it makes sense to them?
>
> Thanks,
> Micah
>
> On Mon, Jun 10, 2019 at 2:23 AM Fan Liya <[email protected]> wrote:
>
> > Hi all,
> >
> > This is concerning issue ARROW-3396.
> >
> > I have summarized the problem (please see if my understanding is correct),
> > and proposed some solutions to it. Please give your valuable feedback.
> > For details, please see:
> >
> > https://docs.google.com/document/d/1Y2E6RbZkUj3SwuEJrlEjaeIPmCA1SIsi9wmbJmVlB2I/edit?usp=sharing
> >
> > Thank you in advance.
> >
> > Best,
> > Liya Fan
