Hi Liya Fan,
Perhaps comment on the original thread?  This differs from my proposal in
terms on details of encoding.  For RLE, I proposed encoding run end indices
instead of run-lengths.  This allows for sublinear access to elements at
the cost of potentially larger bit-widths for the lengths.


Thanks,
Micah

On Wed, Aug 21, 2019 at 6:50 PM Fan Liya <liya.fa...@gmail.com> wrote:

> Hi Wes,
>
> Thanks for the good suggestion.
> It is intended to be sent through IPC. So it should implement FieldVector,
> not just ValueVector.
>
> This can be considered a sub-item of Micah's proposal about
> compression/decompression.
> I will spend more time on that discussion.
>
> Best,
> Liya Fan
>
> On Wed, Aug 21, 2019 at 9:34 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > hi Liya,
> >
> > Do you intend to be able to send RLE vectors using the IPC protocol?
> > If so, we need to spend some time on Micah's discussion about
> > sparseness and encodings/compression.
> >
> > - Wes
> >
> > On Wed, Aug 21, 2019 at 7:33 AM Fan Liya <liya.fa...@gmail.com> wrote:
> > >
> > > Dear all,
> > >
> > > RLE (run length encoding) is a widely used encoding/decoding technique.
> > > Compared with other encoding/decoding techniques, it is easier to work
> > with
> > > the encoded data.
> > >
> > > We want to provide an RLE vector implementation in Arrow. The design
> > > details include:
> > >
> > > 1. RleVector implements ValueVector.
> > > 2. the data structure of RleVector includes an inner vector, plus a
> > > repetition buffer.
> > > 3. we do not provide random access over the RleVector
> > > 4. In the future, we will provide iterators to access the vector in
> > > sequence.
> > > 5. RleVector does not support update, but supports appending.
> > > 6. In the future, we will provide encoder/decoder to efficiently
> > transform
> > > encoded/decoded vectors.
> > >
> > > Please give your valuable feedback.
> > >
> > > Best,
> > > Liya Fan
> >
>

Reply via email to