Re: [VOTE] Release Apache Iceberg 1.5.1 RC0

2024-04-19 Thread Eduard Tudenhoefner
+1 (non-binding)
* validated checksum and signature
* checked license & ran RAT checks
* ran build and tests with JDK11
When verifying signatures I noticed this warning:
gpg: assuming signed data in 'apache-iceberg-1.5.1.tar.gz'
gpg: Signature made Thu 18 Apr 2024 09:03:16 PM CEST
gpg:

Re: [Proposal] Add support for Materialized Views in Iceberg

2024-04-19 Thread Ajantha Bhat
+1 for the proposal.
- Ajantha
On Fri, Apr 19, 2024 at 7:29 AM Benny Chow wrote:
> +1 for separate view and table objects. Walaa's Spark implementation demonstrates how little change it takes on the Iceberg APIs to start sharing MVs between engines.
>
> Thanks
> Benny
>
> On Thu, Apr 18, 2

Re: Flink table maintenance

2024-04-19 Thread Péter Váry
Hi Gen,
*Unaligned checkpoint & AsyncIO*
Let's talk about a concrete example: DataFileRewrite. The task has 3 steps:
1. Planning - this creates multiple RewriteFileGroups, each of which contains the list of small files which should be compacted to a single new file
2. Rewriting data f

Re: Flink table maintenance

2024-04-19 Thread Zhu Zhu
Hi Peter,
> Flink job doing a quick small file reduction after collecting several commits
Triggering maintenance tasks as soon as possible is a valid point to me. But I'm not sure about its priority compared to maintenance tasks which may happen with a delay of a few seconds.
> Single, cont

Re: [Proposal] Add support for Materialized Views in Iceberg

2024-04-19 Thread Renjie Liu
+1 for this proposal.
On Fri, Apr 19, 2024 at 3:40 PM Ajantha Bhat wrote:
> +1 for the proposal.
>
> - Ajantha
>
> On Fri, Apr 19, 2024 at 7:29 AM Benny Chow wrote:
>
>> +1 for separate view and table objects. Walaa's Spark
>> implementation demonstrates how little change it takes on the Icebe

FlinkFileIO implementation

2024-04-19 Thread Péter Váry
Hi Iceberg Team,
Flink has its own FileSystem implementation. See: https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/overview/ .
This FileSystem already has several implementations:
- Hadoop
- Azure
- S3
- Google Cloud Storage
- ...
As a general rule

Re: [Proposal] Add support for Materialized Views in Iceberg

2024-04-19 Thread John Zhuge
+1 on separate view and table metadata.
I'd like to share our experience of running such a design at Netflix for years. The changes to the view spec are minimal and there are no changes to the Iceberg table metadata other than tracking an additional table property for capturing freshness. The storage table