Re: [VOTE] Release Apache Iceberg Rust 0.4.0 RC2

2024-12-17 Thread Renjie Liu
+1 binding. Did the following verification: [*] Download links are valid. [*] Checksums and signatures. [*] LICENSE/NOTICE files exist [*] No unexpected binary files [*] All source files have ASF headers [*] Can compile from source Running `make test` on the following platforms works! - macos + m

Re: [discuss] Allow 200 responses for HEAD requests in REST API

2024-12-17 Thread Eduard Tudenhöfner
I agree with Yufei's observation. Changing the return code in the spec from 204 to 200 will just cause additional downstream work that doesn't seem worth it. Returning 204 makes the API also very explicit in telling that the request succeeded but that there's no content in the response that the cli

Re: [VOTE] Release Apache Iceberg Rust 0.4.0 RC2

2024-12-17 Thread Xuanwo
+1 non-binding Thank you for carrying this release, seems nice! [x] Download links are valid. [x] Checksums and signatures. :) for i in *.tar.gz; do gpg --verify $i.asc $i sha512sum -c $i.sha512 done gpg: Signature made Wed 18 Dec 2024 09:01:45 AM CST gpg:using RSA key 736A14A5

Re: Optimize object lookup in REST catalog

2024-12-17 Thread Vladimir Ozerov
Hi Piotr, Yufei, Thanks for the feedback. In addition to a single object lookup and namespace listing, is there anything else that can potentially help query engines reduce latency during semantic analysis? As an example, maybe a bulk object lookup? Like, you have 10 objects in a query. Usually,

Re: [Discuss] Document Snapshot Summary Optional Fields for Standardization

2024-12-17 Thread Honah J.
Thank you all for the feedback! It appears we have reached a consensus on documenting the snapshot summary fields. Additionally, there is a preference to document these fields outside the main body of the spec and make sure they are not tied to the spec version. Two options have been suggested:

Re: Optimize object lookup in REST catalog

2024-12-17 Thread Yufei Gu
Seems like a nice optimization. I also echo Piotr's point about the list endpoints. Either a `relation` or a `table-like` is good to have. Looking forward to a formal proposal! Yufei On Thu, Dec 5, 2024 at 5:37 AM Piotr Findeisen wrote: > Hi > > I like the idea to just "get relation" to get the re

[VOTE] Drop Hive runtime

2024-12-17 Thread Manu Zhang
Hi all, Thanks for sharing your ideas in the discussion of Hive support[1]. We have a consensus to drop Hive runtime and upgrade Hive metastore connector to Hive 4. However, it looks like we can't upgrade metastore support till Spark 4[2]. Hence, I went on to create a separate PR to remove Hive ru

[VOTE] Release Apache Iceberg Rust 0.4.0 RC2

2024-12-17 Thread Sung Yun
Hello, Apache Iceberg Rust Community, This is a call for a vote to release Apache Iceberg rust version v0.4.0-rc.2. The tag to be voted on is v0.4.0-rc.2. The release candidate: https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-rust-0.4.0-rc.2/ Keys to verify the release candidate:

Re: [VOTE] Release Apache Iceberg Rust 0.4.0 RC1

2024-12-17 Thread Sung Yun
Hi folks, As suspected above, the source package was signed with a different set of signing keys by accident. As the release manager, I will close this VOTE thread and reopen the vote with a new release candidate, with the corrected signature. Thanks again to Kevin for quickly testing out the rel

Re: [VOTE] Release Apache Iceberg Rust 0.4.0 RC1

2024-12-17 Thread Sung Yun
Hi Kevin, Thanks for the speedy response, and for reporting the issue! It sounds like the wrong credentials may have been used for signing. And my self-verifying the signature on the same machine had glossed over that. I will re-verify the signature once I am back at my desk and follow up on the

Re: REST catalog high availability

2024-12-17 Thread Vladimir Ozerov
Hi Jean, Thanks for the response, I agree with all points. For reference, you mentioned Apache Ignite - I worked on it for many years, and used to be an active committer/PMC there. This project is a very good example of how multiple failures to keep the complexity under control significantly slow

Re: [DISCUSS] Remove snapshot-id from IRC SetStatisticsUpdate

2024-12-17 Thread Marc Cenac
+1 to removing this redundancy in the REST spec and Java implementation On Tue, Dec 17, 2024 at 12:10 PM Kevin Liu wrote: > Hey Christian, > > Thanks for bringing this up! We also noticed this issue while implementing > table statistics in Python [1]. > I'm in favor of removing the outer field.

Re: [VOTE] Release Apache Iceberg Rust 0.4.0 RC1

2024-12-17 Thread Kevin Liu
Hey Sung, Thanks for working on the 0.4.0 release! I went through a few steps to verify this release and ran into an issue verifying the signature. Cannot check the signature: ``` ➜ curl https://downloads.apache.org/iceberg/KEYS -o KEYS gpg --import KEYS ➜ gpg --verify apache-iceberg-rust-0.4.0-

[VOTE] Release Apache Iceberg Rust 0.4.0 RC1

2024-12-17 Thread Sung Yun
Hello, Apache Iceberg Rust Community, This is a call for a vote to release Apache Iceberg rust version v0.4.0-rc.1. The tag to be voted on is v0.4.0-rc.1. The release candidate: https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-rust-0.4.0-rc.1/ Keys to verify the release candidate:

Re: [discuss] Allow 200 responses for HEAD requests in REST API

2024-12-17 Thread Yufei Gu
The distinction between 200 and 204 is subtle enough that I'm comfortable using them interchangeably in this context. My main concern is that, if we make this change, all clients—except for PyIceberg—will need to be updated to support both 200 and 204, since a server could return either status code

[discuss] Allow 200 responses for HEAD requests in REST API

2024-12-17 Thread Kevin Liu
Hey folks, I’d like to propose adding status code 200 as a valid response for HEAD requests in the Catalog REST API. Currently, the following HEAD requests return status code 204 for a successful response: * namespaceExists

Re: [DISCUSS] Remove snapshot-id from IRC SetStatisticsUpdate

2024-12-17 Thread Kevin Liu
Hey Christian, Thanks for bringing this up! We also noticed this issue while implementing table statistics in Python [1]. I'm in favor of removing the outer field. Since this is part of the spec change, we would need to follow the proper deprecation and removal path, similar to what we did for `la

Re: 1.7.1 breaking change related to ADLS support

2024-12-17 Thread Jean-Baptiste Onofré
Hi Cheng, Thanks for the update. The issue appeared in the test, but I guess it can also impact runtime use. Let me take a look at the update PR. Regards JB On Tue, Dec 17, 2024 at 3:06 PM Cheng Pan wrote: > > Spark 3.5.4 (under RC3 voting) introduces a breaking change that requires > Iceberg

[DISCUSS] Auth Manager API – remaining tasks

2024-12-17 Thread Alex Dutra
Hi all, As you know, the original Auth Manager PR [1] has been closed because it was deemed too big for review. The reviewers, after 4 months, requested to split the PR into smaller PRs and merge them one by one. I am planning six PRs in total: * Auth Manager API part 1: HTTPRequest, HTTPHeader

Re: 1.7.1 breaking change related to ADLS support

2024-12-17 Thread Cheng Pan
Spark 3.5.4 (under RC3 voting) introduces a breaking change that requires Iceberg to be patched[1]. Please include Spark 3.5.4 support in the patch release. Since some users are sticky to Java 8, it would be great if Iceberg also releases a new patch version of 1.6.x, otherwise they won’t be ab

Re: REST catalog high availability

2024-12-17 Thread Jean-Baptiste Onofré
Hi Vladimir As I said in my previous email, I can already "inject" the PoolingHttpClientConnectionManager in the client. So, technically speaking, I think it's do-able. So, we can always document how to use that with several endpoints. I understand your points and they make sense. However, implem

Re: REST catalog high availability

2024-12-17 Thread Vladimir Ozerov
Hi, Thank you for the feedback. I understand the concerns about adding more and more features to the protocol, especially if they might be implemented elsewhere. And every added bit of complexity should have clear cost/benefit ratio. Iceberg is becoming the de-facto standard for multiple workload

Re: 1.7.1 breaking change related to ADLS support

2024-12-17 Thread Jean-Baptiste Onofré
That works for me. In the meantime, I will draft a proposal (in terms of content) for 1.8.0. I volunteer to drive 1.7.2 release if needed. Regards JB On Tue, Dec 17, 2024 at 9:58 AM Fokko Driesprong wrote: > > Thanks for raising this Alex, > > I suggest doing a 1.7.2 patch release since we don'

Re: 1.7.1 breaking change related to ADLS support

2024-12-17 Thread Eduard Tudenhöfner
I agree with Fokko in that we should do a 1.7.2 release for people using ADLSFileIO On Tue, Dec 17, 2024 at 9:58 AM Fokko Driesprong wrote: > Thanks for raising this Alex, > > I suggest doing a 1.7.2 patch release since we don't want to leave the > 1.7.x version in a broken state for the ADLSFil

Re: 1.7.1 breaking change related to ADLS support

2024-12-17 Thread Fokko Driesprong
I took the liberty of creating a 1.7.2 milestone: https://github.com/apache/iceberg/milestone/52 Kind regards, Fokko Op di 17 dec 2024 om 09:58 schreef Fokko Driesprong : > Thanks for raising this Alex, > > I suggest doing a 1.7.2 patch release since we don't want to leave the > 1.7.x version in

Re: 1.7.1 breaking change related to ADLS support

2024-12-17 Thread Fokko Driesprong
Thanks for raising this Alex, I suggest doing a 1.7.2 patch release since we don't want to leave the 1.7.x version in a broken state for the ADLSFileIO. Kind regards, Fokko Op di 17 dec 2024 om 07:40 schreef Jean-Baptiste Onofré : > Hi Alex, > > It was exactly my concern (and question) when I d