Hi Raphael,
I think this is indeed a documentation mistake; it should say 0!
For exactly the reasons you mentioned, I determined that it is best
to leave the null count field always 0 for RLE arrays. This way it is
consistent with union types, at least.
RunLengthEncoded data should not contain
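
To illustrate why a logical null count is not free for RLE data, here is a minimal Python sketch (not the Arrow C++ implementation). It assumes that the run-ends buffer holds cumulative, exclusive logical end positions and that a null run is represented by None in the values; the function name and parameters are hypothetical.

```python
def logical_null_count(run_ends, values, offset=0, length=None):
    """Count logical nulls by walking the runs (illustrative sketch only).

    run_ends: cumulative, exclusive logical end position of each run (assumed)
    values:   one value per run; None marks a null run (assumed)
    """
    total = run_ends[-1] if run_ends else 0
    if length is None:
        length = total - offset
    stop = offset + length

    count = 0
    run_start = 0
    for run_end, value in zip(run_ends, values):
        # Overlap of this run [run_start, run_end) with the slice [offset, stop)
        lo = max(run_start, offset)
        hi = min(run_end, stop)
        if value is None and hi > lo:
            count += hi - lo
        run_start = run_end
    return count
```

The count depends on the slice offset and length, so it has to be recomputed by visiting the runs, which is one reason a stored field that is always 0 can be the simpler contract.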
> {
> length: 2
> offset: 6
> rle: {
> length: 1 // actually physical length
> offset: 2
> buffer: [3, 5, 8]
> }
> values: {
> length: 1
> offset: 2
> buffer: [5, 6, 7]
> }
> }
> Does this make sense?
I think this is a valid way o
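
The quoted layout can be checked with a small Python sketch. It assumes the `rle` buffer holds cumulative, exclusive run ends in logical coordinates, and that the child `offset`/`length` select physical runs; `decode_rle_slice` and its parameters are hypothetical names, not Arrow API.

```python
from bisect import bisect_right

def decode_rle_slice(length, offset, run_ends, values,
                     child_offset=0, child_length=None):
    """Expand a sliced RLE array to plain values (illustrative sketch only)."""
    if child_length is None:
        child_length = len(run_ends) - child_offset
    # Apply the child offset/length to pick the physical runs in use
    ends = run_ends[child_offset:child_offset + child_length]
    vals = values[child_offset:child_offset + child_length]

    out = []
    for i in range(offset, offset + length):
        # Index of the first run whose end is strictly greater than i
        run = bisect_right(ends, i)
        if run >= len(vals):
            raise ValueError(f"logical index {i} is past the last run end")
        out.append(vals[run])
    return out

# The layout from the quoted example: parent length 2, offset 6;
# children sliced to physical offset 2, length 1.
decoded = decode_rle_slice(2, 6, [3, 5, 8], [5, 6, 7],
                           child_offset=2, child_length=1)
```

Under these assumptions both logical positions 6 and 7 fall into the run ending at 8, so the slice decodes to two copies of the value 7.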
> I'm not sure I understand this, could you provide an example of the
> problem that the child array solves?
Hello Everyone,
Recently, I have implemented support for run-length encoding in Arrow
C++. So far my implementation is split into different subtasks of
ARROW-16771 (https://issues.apache.org/jira/browse/ARROW-16771).
I have (draft) PRs available for:
- general handling of RLE in arrow C++, Type,
I created a Jira for adding RLE as ARROW-16771, and draft PRs:
- https://github.com/apache/arrow/pull/13330
Encode/decode functions (currently for fixed-width types only)
- https://github.com/apache/arrow/pull/1
For updating docs
Best,
Tobias
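
For readers unfamiliar with the encoding, the encode/decode step mentioned above can be sketched in plain Python. This is not the code from the PRs; it assumes runs are stored as cumulative, exclusive run ends plus one value per run, and the function names are hypothetical.

```python
def rle_encode(data):
    """Run-length encode a sequence into (run_ends, values). Sketch only."""
    run_ends, values = [], []
    for i, item in enumerate(data):
        if values and values[-1] == item:
            # Same value as the current run: extend its end position
            run_ends[-1] = i + 1
        else:
            # New value: start a new run ending just after position i
            values.append(item)
            run_ends.append(i + 1)
    return run_ends, values


def rle_decode(run_ends, values):
    """Expand (run_ends, values) back into a plain list. Sketch only."""
    out = []
    start = 0
    for end, value in zip(run_ends, values):
        out.extend([value] * (end - start))
        start = end
    return out
```

For example, encoding `[5, 5, 5, 6, 6, 7, 7, 7]` under these assumptions yields run ends `[3, 5, 8]` and values `[5, 6, 7]`, and decoding restores the original list.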
Am Dienstag, dem 31.05.2022 um 17:13 -0500 s
On Friday, 03.06.2022 at 09:32 -0700, Micah Kornfield wrote:
> >
> > Thinking about compatibility with existing software, RLE could
> > possibly even be made an Extension Type that follows the layout of a
> > struct of int32 and the encoded value type. I'm wondering whether this
> > would be
> Well, Arrow C++ does not have a notion of encoding distinct from the
> data type. Adding such a notion would risk breaking compatibility for
> all existing software that hasn't been upgraded to dispatch based on
> encoding.
Thinking about compatibility with existing software, RLE could possibl
On Tuesday, 31.05.2022 at 12:41 -0700, Micah Kornfield wrote:
>
> - Should we allow multiple runs of the same value following each
> other?
> > Otherwise we would either need a pass to correct this after a lot of
> > operations, or make RLE-aware versions of their kernels.
>
> Is there
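
The "pass to correct this" mentioned in the quote could look roughly like the following Python sketch, which merges consecutive runs that carry the same value. It assumes cumulative, exclusive run ends; `merge_adjacent_runs` is a hypothetical helper, not an Arrow kernel.

```python
def merge_adjacent_runs(run_ends, values):
    """Collapse consecutive runs with equal values into one run. Sketch only."""
    merged_ends, merged_values = [], []
    for end, value in zip(run_ends, values):
        if merged_values and merged_values[-1] == value:
            # Same value as the previous run: just move its end forward
            merged_ends[-1] = end
        else:
            merged_ends.append(end)
            merged_values.append(value)
    return merged_ends, merged_values
```

The pass is linear in the number of physical runs. If repeated values in adjacent runs are allowed, it can be applied lazily instead of after every operation.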
Hi,
On Tuesday, 31.05.2022 at 21:12 +0200, Antoine Pitrou wrote:
>
> Hi,
>
> On 31/05/2022 at 20:24, Tobias Zagorni wrote:
> > Hi, I'm currently working on adding Run-Length encoding to arrow. I
> > created a function to dictionary-encode arrays here (cur
Hi, I'm currently working on adding Run-Length encoding to arrow. I
created a function to dictionary-encode arrays here (currently only for
fixed length types):
https://github.com/apache/arrow/compare/master...zagto:rle?expand=1
The general idea is that RLE data will be a nested data type, with a