Hi,
I am reading about Iceberg and am quite new to this.
This Puffin file would be an index from key to data file. Other use cases of
Puffin, such as statistics, are at a per-file level, if I understand
correctly.
Where would the Puffin file for the key -> data file index be stored? It is a property of
the entire table
+1 (binding)
Thanks for running this release Kevin!
- Verified signatures and checksum
- Checked for licenses
- Installed and ran tests
- Did some local testing
Kind regards,
Fokko
On Sat, 9 Nov 2024 at 00:01, Drew wrote:
> +1 (non-binding)
>
> - verified signature and checksum
> - verified RA
JB, this is what we do: we write equality deletes and periodically convert them to positional deletes. We could probably index the keys, perhaps only partially, using Bloom filters; the best option would be to put those Bloom filters inside Puffin.
Shani.
On 9 Nov 2024, at 11:11, Jean-Baptiste Onofré wrote:
Great !
Thanks Russell for driving this release !
Regards
JB
On Fri, Nov 8, 2024 at 4:33 PM Russell Spitzer
wrote:
>
> I'm pleased to announce the release of Apache Iceberg 1.7.0!
>
> Apache Iceberg is an open table format for huge analytic datasets. Iceberg
> delivers high query performance fo
Hi,
I like the idea. My only comment is probably to use versions instead
of check marks, but all good :)
Thanks !
Regards
JB
On Fri, Nov 8, 2024 at 3:33 PM Russell Spitzer
wrote:
>
> Sounds like a great idea to me
>
> On Fri, Nov 8, 2024 at 7:58 AM Renjie Liu wrote:
>>
>> Hi:
>>
>> As iceberg
Hi,
I agree with Peter here, and I would say that it would be an issue for
multi-engine support.
I think, as I already mentioned with others, we should explore an
alternative.
As the main issue is the data file scan in a streaming context, maybe we could
find a way to "index"/correlate for positiona