Apologies for the very delayed reply.
Does unframed LZ4 provide a checksum of the content before compression?
I don't believe so. We would need to add some minimal metadata, such as a
checksum and the uncompressed length. I think this is still fairly simple
compared to implementing the block format.
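
To make the "minimal metadata" idea concrete, here is a rough sketch of what I have in mind. Everything in it is an assumption for illustration, not a proposed spec layout: the 8-byte header (big-endian uncompressed length plus a CRC-32 of the uncompressed content) and the helper names `wrap` and `unwrap` are hypothetical. The compressor is passed in as a function; the usage lines use zlib purely as a stand-in because LZ4 is not in the Python standard library.

```python
import struct
import zlib

# Hypothetical header: 4-byte uncompressed length, then 4-byte CRC-32 of the
# uncompressed content, both big-endian. This layout is an assumption.
HEADER = struct.Struct(">II")


def wrap(content: bytes, compress) -> bytes:
    """Prefix the compressed payload with length and checksum metadata."""
    return HEADER.pack(len(content), zlib.crc32(content)) + compress(content)


def unwrap(blob: bytes, decompress) -> bytes:
    """Decompress the payload and verify it against the header metadata.

    `decompress` takes (payload, uncompressed_length), since an unframed
    block decoder typically needs the output size up front.
    """
    length, checksum = HEADER.unpack_from(blob)
    content = decompress(blob[HEADER.size:], length)
    if len(content) != length:
        raise ValueError("uncompressed length mismatch")
    if zlib.crc32(content) != checksum:
        raise ValueError("checksum mismatch")
    return content


# Usage with zlib standing in for an unframed LZ4 codec.
blob = wrap(b"position deletes" * 4, zlib.compress)
restored = unwrap(blob, lambda payload, n: zlib.decompress(payload))
```

The checksum here covers the content before compression, so corruption is caught after decompression rather than before it. The 4-byte length field also caps the content at 4 GiB in this sketch.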
O
Thanks for putting the spec PRs together, Ryan!
A bit of context below.
The concept of DVs is not external to Iceberg. We have been using Roaring
bitmaps (aka DVs) as an in-memory representation for position deletes,
which allowed us to support vectorized reads and buffer out-of-order
positions i
Hi
Thanks for the PRs! I reviewed Anton's document and will do a pass on the PRs.
IMHO, it's important to get feedback from query engines: delete vectors
are not a problem per se (they are what we already use as the internal
representation), but the use of Puffin files to store them is "impactful"
for the