Re: [Announce] Apache Iceberg Community Meetup in SF and Seattle

2025-02-24 Thread Kevin Liu
Hi everyone, Both the Seattle and SF Iceberg Community meetups are happening this Thursday, 2/27! For Seattle, we have a great line-up of speakers focused on the theme of *interoperability*! * Open Table Format Interoperability with Apache XTable - Rahil Chertara & Anand Sivaram * A general mod

Re: [VOTE] Java implementation notes around current-snapshot-id

2025-02-24 Thread Eduard Tudenhöfner
+1 On Tue, Feb 25, 2025 at 7:31 AM Steven Wu wrote: > +1 > > On Mon, Feb 24, 2025 at 10:13 PM Péter Váry > wrote: > >> +1 >> >> On Tue, Feb 25, 2025, 04:16 Steve Zhang >> wrote: >> >>> +1 (nb) >>> Thanks, >>> Steve Zhang >>> >>> >>> >>> On Feb 24, 2025, at 6:32 PM, Renjie Liu wrote: >>> >>> +

Re: [VOTE] Java implementation notes around current-snapshot-id

2025-02-24 Thread Steven Wu
+1 On Mon, Feb 24, 2025 at 10:13 PM Péter Váry wrote: > +1 > > On Tue, Feb 25, 2025, 04:16 Steve Zhang > wrote: > >> +1 (nb) >> Thanks, >> Steve Zhang >> >> >> >> On Feb 24, 2025, at 6:32 PM, Renjie Liu wrote: >> >> +1 >> >> On Tue, Feb 25, 2025 at 7:00 AM Szehon Ho >> wrote: >> >>> +1 >>> >>

Re: [VOTE] Release Apache PyIceberg 0.9.0rc2

2025-02-24 Thread Kevin Liu
+1 (non-binding) Downloaded from SVN * gpg: Good signature from "Drew Gallardo " [unknown] * Checksum OK * RAT checks passed. 1 extra ERROR line, but it is unrelated ``` ERROR: Ignored 0 lines in your exclusion files as comments or empty lines. ``` * Both unit and integration tests passed with Python 3

Re: [VOTE] Java implementation notes around current-snapshot-id

2025-02-24 Thread Péter Váry
+1 On Tue, Feb 25, 2025, 04:16 Steve Zhang wrote: > +1 (nb) > Thanks, > Steve Zhang > > > > On Feb 24, 2025, at 6:32 PM, Renjie Liu wrote: > > +1 > > On Tue, Feb 25, 2025 at 7:00 AM Szehon Ho wrote: > >> +1 >> >> Thanks >> Szehon >> >> On Mon, Feb 24, 2025 at 2:52 PM rdb...@gmail.com >> wrote

Re: [VOTE] Release Apache Iceberg 1.8.1 RC1

2025-02-24 Thread Ajantha Bhat
+1 (non-binding) * validated checksum and signature * checked license docs & ran RAT checks * ran build and tests with JDK11 - Ajantha On Tue, Feb 25, 2025 at 9:52 AM Renjie Liu wrote: > +1 binding. > > Did following check: > 1. Verified sha512 > 2. Verified signature > 3. Run test and build >

Re: [VOTE] Release Apache Iceberg 1.8.1 RC1

2025-02-24 Thread Renjie Liu
+1 binding. Did following check: 1. Verified sha512 2. Verified signature 3. Run test and build 4. Run dev/check-license On Tue, Feb 25, 2025 at 7:46 AM Yuya Ebihara wrote: > +1 (non-binding) > > I verified that Trino CI is green with Iceberg 1.8.1. > Thanks for your work, Edward. > > BR, > Yuy

Re: [DISCUSS][Rust] Frequency of upgrading minimum supported rust version.

2025-02-24 Thread Xuanwo
Hi, renjie > My point is still the same, if iceberg uses lib A 0.1.9, and the 0.1.10 > version of lib A doesn't work with iceberg 0.3, the user of iceberg will face > either of: > 1. Forced to use the old version of lib A, without enjoying the performance / > bug fix of the new version of lib A

Re: [VOTE] Java implementation notes around current-snapshot-id

2025-02-24 Thread Steve Zhang
+1 (nb) Thanks, Steve Zhang > On Feb 24, 2025, at 6:32 PM, Renjie Liu wrote: > > +1 > > On Tue, Feb 25, 2025 at 7:00 AM Szehon Ho > wrote: >> +1 >> >> Thanks >> Szehon >> >> On Mon, Feb 24, 2025 at 2:52 PM rdb...@gmail.com >> mailt

Re: [DISCUSS][Rust] Frequency of upgrading minimum supported rust version.

2025-02-24 Thread Renjie Liu
Hi, Xuanwo: Thanks for pointing that out, and you're right about the version used. However, if iceberg depends on lib A 0.1.10, but app B depends on lib A > 0.1.9, users will finally get: > - iceberg 0.3 > - lib A 0.1.10 (the biggest of all lib A 0.1.X) My point is still the same, if iceberg us
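The resolution outcome discussed in this message (two dependents of lib A ending up on 0.1.10) can be sketched as follows. This is a minimal illustration, not Cargo's resolver: the names `Ver`, `compatible` and `resolve` are hypothetical, every requirement is assumed to be a Cargo-style caret requirement, and pre-release versions are ignored.

```rust
/// Versions are modelled as (major, minor, patch) tuples; tuple ordering
/// matches version ordering.
type Ver = (u64, u64, u64);

/// Simplified caret rule (assumption: all requirements are caret requirements):
/// `v` satisfies `req` when it is at least `req` and shares the leftmost
/// non-zero component with it.
fn compatible(req: Ver, v: Ver) -> bool {
    let same_series = if req.0 > 0 {
        v.0 == req.0
    } else if req.1 > 0 {
        v.0 == 0 && v.1 == req.1
    } else {
        v == req
    };
    same_series && v >= req
}

/// Pick the highest available version that satisfies every requirement, if any.
fn resolve(available: &[Ver], reqs: &[Ver]) -> Option<Ver> {
    available
        .iter()
        .copied()
        .filter(|v| reqs.iter().all(|r| compatible(*r, *v)))
        .max()
}
```

With lib A versions 0.1.8, 0.1.9, 0.1.10 and 0.2.0 available, requirements of 0.1.10 and 0.1.9 resolve to 0.1.10. When the requirements fall in different 0.X series the sketch returns `None`; Cargo itself can instead build both semver-incompatible copies side by side.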

Re: [VOTE] Java implementation notes around current-snapshot-id

2025-02-24 Thread Renjie Liu
+1 On Tue, Feb 25, 2025 at 7:00 AM Szehon Ho wrote: > +1 > > Thanks > Szehon > > On Mon, Feb 24, 2025 at 2:52 PM rdb...@gmail.com wrote: > >> +1 >> >> On Mon, Feb 24, 2025 at 12:26 PM Daniel Weeks wrote: >> >>> +1 >>> >>> On Mon, Feb 24, 2025, 11:00 AM Russell Spitzer < >>> russell.spit...@gma

Re: [VOTE] Deprecate or remove distinct_count

2025-02-24 Thread Ajantha Bhat
+1 to deprecate it again and remove it later on. I did some digging and found out that Dremio was interested in this field for secondary indexes. https://lists.apache.org/thread/z948wfssgvrpf9b3g6660gh5cxb2d3sn But we didn't make progress on that. - Ajantha On Tue, Feb 25, 2025 at 5:03 AM Scott

Re: [VOTE] Release Apache Iceberg 1.8.1 RC1

2025-02-24 Thread Yuya Ebihara
+1 (non-binding) I verified that Trino CI is green with Iceberg 1.8.1. Thanks for your work, Edward. BR, Yuya On Tue, Feb 25, 2025 at 2:20 AM Hussein Awala wrote: > +1 (non-binding) - Tested with Spark 3.5.4, JDK 17, S3 FileIO (MinIO), and > Hive3 as the catalog. > Validated the following oper

Re: [VOTE] Deprecate or remove distinct_count

2025-02-24 Thread Scott Cowell
Speaking for Dremio, I checked and we're not using distinct_counts anywhere; we interact with manifests exclusively through the Iceberg Java API, which as mentioned doesn't support this field. I'm in favor of removing it; I didn't even know it existed, as I tend to look at the Java DataFile/Conten

Re: [VOTE] Java implementation notes around current-snapshot-id

2025-02-24 Thread Szehon Ho
+1 Thanks Szehon On Mon, Feb 24, 2025 at 2:52 PM rdb...@gmail.com wrote: > +1 > > On Mon, Feb 24, 2025 at 12:26 PM Daniel Weeks wrote: > >> +1 >> >> On Mon, Feb 24, 2025, 11:00 AM Russell Spitzer >> wrote: >> >>> +1 >>> >>> On Mon, Feb 24, 2025 at 12:55 PM Fokko Driesprong >>> wrote: >>> >>>

Re: [VOTE] Deprecate or remove distinct_count

2025-02-24 Thread rdb...@gmail.com
I can provide some context here. The field is very old and when we realized that it was not only unused but also difficult to produce and use in practice (can't be combined) we deprecated the field. However, some folks from Dremio wanted to bring it back because they said they could store values th

Re: [DISCUSS] Rest Catalog 419 Response Code

2025-02-24 Thread rdb...@gmail.com
Yeah, I don't think that this response was used. We thought that it was needed but it can probably be safely removed as I'm not aware of any implementation that sent or handled it. If that's the right thing to do because there are other more standard mechanisms for sending more information about a

Re: [VOTE] Java implementation notes around current-snapshot-id

2025-02-24 Thread rdb...@gmail.com
+1 On Mon, Feb 24, 2025 at 12:26 PM Daniel Weeks wrote: > +1 > > On Mon, Feb 24, 2025, 11:00 AM Russell Spitzer > wrote: > >> +1 >> >> On Mon, Feb 24, 2025 at 12:55 PM Fokko Driesprong >> wrote: >> >>> Hi everyone, >>> >>> Recently, there was confusion >>>

[VOTE] Release Apache PyIceberg 0.9.0rc2

2025-02-24 Thread Drew
Hi Everyone, I propose that we release the following RC as the official PyIceberg 0.9.0 release. A summary of the high level features: - 235 new commits *High Level Features:* - Implemented support for Alibaba OSS protocol in PyArrowFileIO - Enabled Dynamic Overwrite capability - I

Re: [VOTE] Java implementation notes around current-snapshot-id

2025-02-24 Thread Daniel Weeks
+1 On Mon, Feb 24, 2025, 11:00 AM Russell Spitzer wrote: > +1 > > On Mon, Feb 24, 2025 at 12:55 PM Fokko Driesprong > wrote: > >> Hi everyone, >> >> Recently, there was confusion >> about >> valid values for the current-snapshot

[VOTE] Java implementation notes around current-snapshot-id

2025-02-24 Thread Fokko Driesprong
Hi everyone, Recently, there was confusion about valid values for the current-snapshot-id, which led to implementation notes in the spec. Thanks for the reviews so far and to Daniel fo

Re: [VOTE] Java implementation notes around current-snapshot-id

2025-02-24 Thread Russell Spitzer
+1 On Mon, Feb 24, 2025 at 12:55 PM Fokko Driesprong wrote: > Hi everyone, > > Recently, there was confusion > about > valid values for the current-snapshot-id, which led to implementation > notes

Re: [VOTE] Release Apache Iceberg 1.8.1 RC1

2025-02-24 Thread Hussein Awala
+1 (non-binding) - Tested with Spark 3.5.4, JDK 17, S3 FileIO (MinIO), and Hive3 as the catalog. Validated the following operations: - Creating a new table - Appending data - Merging into tables - Updating schema and properties - Scanning tables On Mon, Feb 24, 2025 at 2:49 PM Eduard Tudenhoefner

Re: [VOTE] Allow Row-Lineage with Equality Deletes

2025-02-24 Thread Russell Spitzer
OK, we are all on the same page :) I'll finish up the last reviewer's comments and merge. On Fri, Feb 21, 2025 at 2:10 PM Christian Thiel wrote: > +1 (non-binding) > > On Thu, 20 Feb 2025 at 19:47, Honah J. wrote: > >> +1 >> >> Best, >> Honah >> >> On Thu, Feb 20, 2025 at 10:45 AM Yufei Gu wrote: >

[VOTE] Release Apache Iceberg 1.8.1 RC1

2025-02-24 Thread Eduard Tudenhoefner
Hi Everyone, I propose that we release the following RC as the official Apache Iceberg 1.8.1 release. The commit ID is *9ce0fcf0af7becf25ad9fc996c3bad2afdcfd33d* * This corresponds to the tag: *apache-iceberg-1.8.1-rc1* * https://github.com/apache/iceberg/commits/apache-iceberg-1.8.1-rc1 * https:

Re: [DISCUSS][Rust] Frequency of upgrading minimum supported rust version.

2025-02-24 Thread Xuanwo
Hi, renjie > 1. Iceberg 0.3 depends on library a 0.1 > 2. Application b depends on iceberg 0.3 and library a 0.2 > 3. Library a 0.2 doesn't work with iceberg 0.3 > 4. Application b crashed in production. This example can't happen in Rust. In Rust, every version within 0.X is considered a brea
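For readers less familiar with the versioning rule this message relies on: Cargo treats a bare requirement such as `0.1.9` as a caret requirement, and for versions below 1.0 the leftmost non-zero component is the one that marks a breaking change. The sketch below is a simplified re-implementation of that rule for illustration only. `Version`, `parse` and `caret_matches` are hypothetical names, not Cargo's API, and pre-release and build metadata are ignored.

```rust
/// A plain `major.minor.patch` version (pre-release tags are ignored here).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

/// Parse "1.2.3" into a Version; returns None on malformed input.
fn parse(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let v = Version {
        major: parts.next()??,
        minor: parts.next()??,
        patch: parts.next()??,
    };
    // Reject inputs with more than three components.
    if parts.next().is_some() {
        return None;
    }
    Some(v)
}

/// Simplified caret rule: `candidate` satisfies `^req` when it is not older
/// than `req` and shares the leftmost non-zero component with it.
fn caret_matches(req: Version, candidate: Version) -> bool {
    if candidate < req {
        return false;
    }
    if req.major > 0 {
        candidate.major == req.major
    } else if req.minor > 0 {
        candidate.major == 0 && candidate.minor == req.minor
    } else {
        candidate.major == 0 && candidate.minor == 0 && candidate.patch == req.patch
    }
}
```

Under this rule a requirement of 0.1.9 accepts 0.1.10 but not 0.2.0, so a crate that declares library a 0.1 is never silently handed library a 0.2.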

Re: [DISCUSS][Rust] Frequency of upgrading minimum supported rust version.

2025-02-24 Thread Renjie Liu
Thanks to xxchan's explanation, it seems that we need to resolve the dependency version first. 2.1 Always use the latest dependency version in Cargo.toml, and force > downstream applications to upgrade transitive dependencies at the same time. > Benefits: let downstream enjoy latest bug fixes and

Re: [DISCUSS] Cleanup unreferenced statistics files through DropTableData

2025-02-24 Thread Gabor Kaszab
Hi All, `remove_orphan_files` for sure drops the previous stat files, but in case you drop the table they will remain on disk forever. I don't have an answer here (I reviewed the above mentioned PR and raised these concerns there) but I think we should figure out a way to avoid accumulating unrefe

Re: [DISCUSS] Rest Catalog 419 Response Code

2025-02-24 Thread Alex Dutra
Hi all, Thanks Sung for your support – I will start a discussion thread soon on the topic of expected client behavior for both 401 and 419 response codes. And out of curiosity, I looked at how the AuthenticationTimeoutResponse type was currently handled in iceberg-core, and was surprised to see t