Re: RLE array slicing

Tobias Zagorni Thu, 15 Sep 2022 05:23:25 -0700

> {
>     length: 2
>     offset: 6
>     rle: {
>         length: 1 // actually physical length
>         offset: 2
>         buffer: [3, 5, 8]
>     }
>     values: {
>        length: 1
>        offset: 2
>        buffer: [5, 6, 7]
>     }
> }
> Does this make sense?


I think this is a valid way of doing it. There are 2 problems I see
with this variant:

- The Slice function would need to be modified to handle RLE
specially. Slicing RLE based on a logical offset will now always be
O(log(n)). Slicing by just setting offset/length of the main array
will no longer work. (This may not be that bad performance-wise, since
in a lot of cases we can't get around resolving the physical offset at
some point.)

- How would we slice nested types, e.g. a struct where some fields are
run-length encoded?
  - We don't touch the RLE fields, since they are struct fields and
    these are not touched during slicing -> for anything where RLE is
    nested there is still only a logical offset. Now performance
    characteristics are also inconsistent between rle and
    struct<rle,...>.
  - We change the RLE child's run ends/values arrays. Now we have a
    struct field that is no longer a valid array on its own, likely
    breaking a lot of things.
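
To illustrate the first point, here is a minimal sketch of resolving a
logical slice to physical offsets with a binary search over the run
ends. It is plain Python, not the Arrow implementation; the function
name physical_slice and the use of plain lists are assumptions made
for illustration only.

```python
from bisect import bisect_right


def physical_slice(run_ends, offset, length):
    """Map a logical (offset, length) to (physical_offset, physical_length).

    run_ends holds the exclusive logical end of each run, e.g. [3, 5, 8].
    Hypothetical helper, not part of any Arrow API.
    """
    # Index of the run that contains the first logical index: O(log n).
    start = bisect_right(run_ends, offset)
    if length == 0:
        return (start, 0)
    # Index of the run that contains the last logical index of the slice.
    last = bisect_right(run_ends, offset + length - 1)
    return (start, last - start + 1)


# The quoted example: logical offset 6, length 2 over run ends [3, 5, 8].
print(physical_slice([3, 5, 8], 6, 2))
```

With the buffers from the quoted example this yields physical offset 2
and physical length 1, which matches the offset/length shown for the
rle and values children there.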

> Apologies Weston, if this is what you were getting at, but I think it
> is slightly different because you talked about 0 offsets in your
> example.
> 
> 
> > > So a slice at 4 would be:
> > > Run ends: [5, 8]
> > > Values: [6, 7]
> 
> 
> How do you interpret that?
> > Naively, that means the logical values [6, 6, 6, 6, 6, 7, 7, 7]
> > It doesn't seem right...

We could find the logical offset by looking into run_ends_array[-1] if
run_ends_array.offset is not 0. That said, the current code does not
support that.
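
A rough sketch of that idea, in plain Python rather than the actual
Arrow code (the decode function and its parameters are hypothetical):
when the run ends child has a non-zero offset, the run end just before
that offset is used as the logical start of the first visible run.

```python
def decode(run_ends, values, physical_offset, physical_length,
           use_previous_run_end=True):
    """Expand the visible runs into logical values.

    run_ends and values are the full underlying buffers; the
    physical_offset/physical_length pair selects the visible runs.
    Hypothetical helper for illustration only.
    """
    if use_previous_run_end and physical_offset > 0:
        # run_ends_array[-1] relative to the offset: end of the run
        # just before the first visible one.
        prev = run_ends[physical_offset - 1]
    else:
        # Naive reading: the first visible run starts at logical 0.
        prev = 0
    out = []
    for i in range(physical_offset, physical_offset + physical_length):
        out.extend([values[i]] * (run_ends[i] - prev))
        prev = run_ends[i]
    return out


# Buffers [3, 5, 8] / [5, 6, 7], visible run ends [5, 8], values [6, 7].
naive = decode([3, 5, 8], [5, 6, 7], 1, 2, use_previous_run_end=False)
fixed = decode([3, 5, 8], [5, 6, 7], 1, 2)
```

In this sketch the naive reading reproduces the eight values from the
quoted example, while looking at the previous run end gives the first
visible run a length of 2 instead of 5. Note that this only recovers
the start of the first visible run (3 here); a slice that starts in
the middle of a run, such as at 4, would presumably still need a
logical offset stored somewhere.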

Best,
Tobias
