> {
> length: 2
> offset: 6
> rle: {
> length: 1 // actually physical length
> offset: 2
> buffer: [3, 5,8]
> }
> values: {
> length: 1
> offset: 2
> buffer: [5, 6, 7]
> }
> }
> Does this make sense?
I think this is a valid way of doing it. There are 2 problems I see
with this variant:
- The Slice function would need to be modified to to specially handle
RLE. Slicing RLE based on a logical offset will now always be O(log(n))
. Slicing by just setting offset/length of the main array will no
longer work. (which may not be that bad performance-wise, since in a
lot of cases we can't get around resolving the phyiscal offset at some
point).
- How would we slice nested types, i.e a struct where some fields are
run-length encoded?
- We don't touch the rle fields since they are struct fields and
these are not touched during slicing -> for anything where RLE is
nested there is still only a logical offset. Now performance
characteristics are also inconsistent between rle and struct<rle,...>
- We change the RLE child's run ends/values arrays. Now we have a
struct field that is no longer a valid array on its own, likely
breaking a lot of things
> Apologies Weston, if this is what you were getting at, but I think it
> is
> slightly different because you talked about 0 offsets in your
> example.
>
>
> > > So a slice at 4 would be:
> > > Run ends: [5, 8]
> > > Values: [6, 7]
>
>
> How do you interpret that?
> > Naively, that means the logical values [6, 6, 6, 6, 6, 7, 7, 7]
> > It doesn't seem right...
we could find the logical offset by looking into run_ends_array[-1] if
run_ends_array.offset is not 0. That said the current code does not
support that.
>
Best,
Tobias