Re: [VOTE] Update partition stats spec for V3

2025-02-04 Thread Jean-Baptiste Onofré
+1 (non binding) Regards JB On Sat, Feb 1, 2025 at 3:01 AM Anton Okolnychyi wrote: > > Hi all, > > I propose the following updates to our partition stats spec in V3: > > - Modify `position_delete_record_count` to include a sum of position deletes > across position delete files and DVs > - Keep

[Iceberg Summit 25] Registration is live!

2025-01-29 Thread Jean-Baptiste Onofré
Hi folks, I'm very happy to share that registration for the Iceberg Summit 2025 is officially live! https://www.icebergsummit2025.com/ We are looking forward to seeing you there (in person or virtual ;)) ! Regards JB

[CANCEL][VOTE] Release Apache Iceberg 1.7.2 rc0

2025-01-29 Thread Jean-Baptiste Onofré
Hi folks, in order to merge the LICENSE/NOTICE improvement on our binary artifacts, I am cancelling this vote for rc0. As soon as the PRs are merged, I will propose rc1 for a vote. Regards JB On Sun, Jan 26, 2025 at 9:41 PM Jean-Baptiste Onofré wrote: > > Hi everyone, > > I propose tha

Re: [VOTE] Release Apache Iceberg 1.7.2 rc0

2025-01-28 Thread Jean-Baptiste Onofré
Sorry (again, it's a long day): I meant this PR: https://github.com/apache/iceberg/pull/12095 On Tue, Jan 28, 2025 at 7:38 PM Jean-Baptiste Onofré wrote: > > Just to be clear: we are working on new changes on NOTICE/LICENSE that > would be great to include in 1.7.x. > > Se

Re: [VOTE] Release Apache Iceberg 1.7.2 rc0

2025-01-28 Thread Jean-Baptiste Onofré
25 at 7:16 PM Jean-Baptiste Onofré wrote: > > -1 (non binding) > > I actually did a change that we should "partially revert": I changed > the NOTICE to use bundle name, which is not optimal. > > I propose to cancel this rc0 to create a rc1 fixing name in NOTICE. >

Re: [VOTE] Release Apache Iceberg 1.7.2 rc0

2025-01-28 Thread Jean-Baptiste Onofré
-1 (non binding) I actually did a change that we should "partially revert": I changed the NOTICE to use bundle name, which is not optimal. I propose to cancel this rc0 to create a rc1 fixing name in NOTICE. Sorry about that. Regards JB On Sun, Jan 26, 2025 at 9:41 PM Jean-Bapti

Re: [VOTE] Release Apache Iceberg 1.7.2 rc0

2025-01-27 Thread Jean-Baptiste Onofré
re changes > than diffs between 1.7.2-rc0 and 1.7.1. > > Regards, > Manu > > On Mon, Jan 27, 2025 at 10:36 PM Jean-Baptiste Onofré > wrote: >> >> Hi, >> >> as 1.7.x is "broken", 1.7.2 makes sense. 1.8.0 brings new features. >> >>

Re: [DISCUSS] Adding RemoveSchemasUpdate update type to REST spec

2025-01-27 Thread Jean-Baptiste Onofré
Hi Gabor Sorry for the late reply. It makes sense to me. Thanks ! Regards JB On Wed, Jan 22, 2025 at 10:32 AM Gabor Kaszab wrote: > > Hi Iceberg Community, > > There has been a PR merged recently that enhances "expire snapshots" to drop > the unused partition specs. I myself started working n

Re: [VOTE] Release Apache Iceberg 1.7.2 rc0

2025-01-27 Thread Jean-Baptiste Onofré
https://lists.apache.org/thread/wvz5sd7pmh5ww1yqhsxpt1kwf993276j > > - Ajantha > > On Mon, Jan 27, 2025 at 2:11 AM Jean-Baptiste Onofré > wrote: >> >> Hi everyone, >> >> I propose that we release the following RC as the official Apache >> Iceberg 1.7.2 releas

[VOTE] Release Apache Iceberg 1.7.2 rc0

2025-01-26 Thread Jean-Baptiste Onofré
Hi everyone, I propose that we release the following RC as the official Apache Iceberg 1.7.2 release. The commit ID is c2105b2634becf68b3fdabd0ee6fb0b6e93d4f0c * This corresponds to the tag: apache-iceberg-1.7.2-rc0 * https://github.com/apache/iceberg/commits/apache-iceberg-1.7.2-rc0 * https://g

Re: [DISCUSS] Apache Iceberg (java) 1.8.0 release

2025-01-26 Thread Jean-Baptiste Onofré
l work >>> > with Fokko on this one. >>> > >>> > 2. We plan to do 1.8.0 in a couple of weeks (Amogh is the release >>> > manager). Due to still some WIP, we "revisited" the 1.8.0 release >>> > content: for instance, as best effort, we

Re: [VOTE] Add initial/write defaults to REST spec

2025-01-24 Thread Jean-Baptiste Onofré
+1 (non binding) It corresponds to the spec (initial/write default values). Thanks ! Regards JB On Fri, Jan 24, 2025 at 5:11 PM Daniel Weeks wrote: > > Everyone, > > I'd like to hold a quick vote for a small addition to the REST spec to > include the initial/write defaults introduced in v3 as

Re: [Discussion] Proposal for Iceberg Community Meetup Guidelines

2025-01-24 Thread Jean-Baptiste Onofré
Hi Kevin Thanks for bringing this discussion. It's a good intention. As a reminder, at The ASF, we are very strict in the trademarks and brand use. You can find some resources here: https://www.apache.org/foundation/marks/ and specifically for event: https://www.apache.org/foundation/marks/#eve

Re: [DISCUSS, VOTE] OpenAPI Metadata Update for EnableRowLineage

2025-01-23 Thread Jean-Baptiste Onofré
+1 (non binding) Regards JB On Wed, Jan 22, 2025 at 11:51 PM Russell Spitzer wrote: > > Hey Y'all > > Yet another Row Lineage Spec update. This adds a MetadataUpdate > EnableRowLineage to the REST Spec. We briefly talked today > about an alternative EnableFeature(Feature Name) API instead but i

Re: [VOTE] Document Snapshot Summary Optional Fields as Subsection of Appendix F in Spec

2025-01-22 Thread Jean-Baptiste Onofré
+1 (non binding) Regards JB On Tue, Jan 21, 2025 at 9:19 PM Honah J. wrote: > > Hi everyone, > > In the last VOTE thread on documenting snapshot summary optional fields, we > decided to move the documentation to a subsection of Appendix F – > Implementation Notes. Since this is a significant c

Re: [DISCUSS] Add metadata stats/metrics management on the REST Spec

2025-01-21 Thread Jean-Baptiste Onofré
> types of situations, but I'm just concerned that we would end up rebuilding > that same functionality to address all of the issues with exposing this > information more directly. > > I'm interested if there are more concrete proposals, but I'm a little > hesitant becaus

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Jean-Baptiste Onofré
+1 (non binding) Thanks Christian ! Regards JB On Tue, Jan 21, 2025 at 8:25 AM Christian Thiel wrote: > > Hi everyone, > > based on good feedback on the [DISCUSS] thread [1] I would like to raise > a vote to deprecate the `snapshot-id` field of the `SetStatisticsUpdate` > in the IRC. It is redu

[DISCUSS] Add metadata stats/metrics management on the REST Spec

2025-01-21 Thread Jean-Baptiste Onofré
Hi folks, I know we don't want to "expose" the whole metadata tables in the REST api, but I would like to discuss adding metadata stats and metrics management. We are discussing this as part of the Apache Polaris TMS proposal. The purpose is: 1. To add interfaces to manage metadata stats and metr

Re: [Discuss][Vote] Spec Change - Add optional field added-rows to Snapshot for Row Lineage

2025-01-17 Thread Jean-Baptiste Onofré
+1 (non binding) Regards JB On Wed, Jan 15, 2025 at 5:59 PM Russell Spitzer wrote: > > Hi Everyone! > > PR: https://github.com/apache/iceberg/pull/11976/files > > Split out from #11948 > > Working on the row-lineage implementation made it clear that we needed a way > to get information from the

Re: [DISCUSS] Apache Iceberg 1.7.2 release

2025-01-16 Thread Jean-Baptiste Onofré
s, > Fokko > > On Wed, Jan 15, 2025 at 06:05, Jean-Baptiste Onofré wrote: > >> Hi Yufei >> >> That makes sense to me. Do we have an ETA for this issue ? >> Are you working on a fix ? Do you need my help on this ? >> >> Thanks ! >> Regard

Re: [DISCUSS] Apache Iceberg (java) 1.8.0 release

2025-01-16 Thread Jean-Baptiste Onofré
ats and Auth Manager. 3. Assuming 1.8.0 will be released at the end of Jan/beginning of Feb, according to our "release cadence", what do you think about planning 1.9.0 in April ? Again with the main targets listed in (2). I tried to sum up what we discussed yesterday :) Thoughts ? Regard

Re: [DISCUSS] Apache Iceberg 1.7.2 release

2025-01-14 Thread Jean-Baptiste Onofré
issues/11922. > It'd be nice to include it in 1.7.2. > > Yufei > > > On Tue, Jan 14, 2025 at 2:00 AM Jean-Baptiste Onofré > wrote: >> >> Hi Fokko >> >> Thanks for the update. I will do a quick pass on GH issues and I will >> run the release (I wil

Re: [DISCUSS] Apache Iceberg 1.7.2 release

2025-01-14 Thread Jean-Baptiste Onofré
where needed. > > Kind regards, > Fokko > > > > > On Mon, Jan 13, 2025 at 08:32, Jean-Baptiste Onofré wrote: >> >> Hi Fokko >> >> It sounds good ! Thanks for the "reminder" :) >> >> I'm also fine either way for the release manager, I can

Re: [DISCUSS] Apache Iceberg 1.7.2 release

2025-01-12 Thread Jean-Baptiste Onofré
Hi Fokko It sounds good ! Thanks for the "reminder" :) I'm also fine either way for the release manager, I can tackle it (with you) or you do, everything is fine for me :) Thanks ! Regards JB On Mon, Jan 13, 2025 at 7:24 AM Fokko Driesprong wrote: > > Hi everyone, > > Over the weekend the last

[DISCUSS] Apache Iceberg (java) 1.8.0 release

2025-01-08 Thread Jean-Baptiste Onofré
Hi folks, We did Apache Iceberg 1.7.0 release on Nov 8, 2024. If we want to keep our release "pace", 1.8.0 should be released around mid February. I think we already have a good "train" of merged PRs (or should be merged soon): default values, REST auth improvements, dependencies updates, etc. W

[ANN] Apache Iceberg Summit 2025, dates, venue and CFP

2025-01-07 Thread Jean-Baptiste Onofré
Hi everyone, With this new year comes a new announcement: Apache Iceberg Summit 2025 ! Iceberg Summit 2025 is a hybrid event sanctioned by The Apache Software Foundation and organized by Dremio, Snowflake, and Microsoft. The summit aims to promote Apache Iceberg education and knowledge-sharing am

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2025-01-06 Thread Jean-Baptiste Onofré
discussion still open, I think I'll move on to take care of > the changes required for the REST spec. Will send a PR for this soon. > > Regards, > Gabor > > > On Thu, Dec 12, 2024 at 4:07 PM Jean-Baptiste Onofré > wrote: >> >> Hi Gabor >> >> Tha

Re: There is no easy way to secure Iceberg data. How can we improve?

2025-01-02 Thread Jean-Baptiste Onofré
Hi Vladimir, Thanks for starting this discussion. I agree with you that the REST catalog "should" be the centralized security mechanism (Polaris is a good example). However, we have two challenges today: - there's no enforcement to use the REST catalog. Some engines are still directly accessing t

Re: 1.7.1 breaking change related to ADLS support

2024-12-19 Thread Jean-Baptiste Onofré
to revert. Let me know if >> there is something else I can do to help fix >> >> On Tue, Dec 17, 2024 at 8:35 AM Jean-Baptiste Onofré >> wrote: >> >>> Hi Cheng, >>> >>> Thanks for the update. The issue appeared in the test, but I guess it >

Re: [DISCUSS] Standardizing Error Handling in the Iceberg Spark Module

2024-12-19 Thread Jean-Baptiste Onofré
It sounds good, +1. Thanks ! Regards JB On Thu, Dec 19, 2024 at 08:36, huaxin gao wrote: > Hi everyone, > > While working on integrating Spark 4.0 with Iceberg, I noticed that error > conditions in the Spark module are primarily validated through the content > of error messages. I need to rev

Re: [VOTE] Release Apache Iceberg Rust 0.4.0 RC2

2024-12-18 Thread Jean-Baptiste Onofré
+1 (non binding) I checked: - hash and signature are OK - LICENSE and NOTICE look good - ASF header is present in all files - no binary file found in the source distribution - build is ok (I just had to fix a local issue with cargo but it was on my machine) Regards JB On Wed, Dec 18, 2024 at 2:

Re: [VOTE] Drop Hive runtime

2024-12-18 Thread Jean-Baptiste Onofré
+1 (non binding) I did a pass on the PRs and they look good to me. Thanks Manu ! Regards JB On Wed, Dec 18, 2024 at 2:59 AM Manu Zhang wrote: > > Hi all, > > Thanks for sharing your ideas in the discussion of Hive support[1]. We have a > consensus to drop Hive runtime and upgrade Hive metastor

Re: 1.7.1 breaking change related to ADLS support

2024-12-17 Thread Jean-Baptiste Onofré
h Iceberg. > > [1] https://github.com/apache/iceberg/pull/11731 > > Thanks, > Cheng Pan > > > > On Dec 17, 2024, at 17:58, Jean-Baptiste Onofré wrote: > > That works for me. In the meantime, I will draft a proposal (in terms > of content) for 1.8.0. > > I v

Re: REST catalog high availability

2024-12-17 Thread Jean-Baptiste Onofré
0 Dec 2024 at 00:57, Yufei Gu : >> >> Load balancing operates at a different layer than APIs, with various >> implementations available, such as etcd and Zookeeper. I’d prefer to avoid >> introducing additional complexity at the web service API level. >>

Re: 1.7.1 breaking change related to ADLS support

2024-12-17 Thread Jean-Baptiste Onofré
ase since we don't want to leave the 1.7.x > version in a broken state for the ADLSFileIO. > > Kind regards, > Fokko > > On Tue, Dec 17, 2024 at 07:40, Jean-Baptiste Onofré wrote: >> >> Hi Alex, >> >> It was exactly my concern (and question) when I did t

Re: 1.7.1 breaking change related to ADLS support

2024-12-16 Thread Jean-Baptiste Onofré
Hi Alex, It was exactly my concern (and question) when I did the review on the PR. I agree it's a breaking change and definitely not good in a micro/patch release. As 1.7.0 is still working, I would propose to wait for 1.8.0 if possible: I'm actually preparing a new thread with a 1.8.0 proposal (in term

Re: [DISCUSS] Remove snapshot-id from IRC SetStatisticsUpdate

2024-12-16 Thread Jean-Baptiste Onofré
Hi, I saw the discussion on Slack. Yeah, it's redundant. I know some catalogs only consider the snapshot id in SetStatisticsUpdate. Regards JB On Fri, Dec 13, 2024 at 8:03 PM Christian wrote: > > Dear all, > > I believe we currently have a redundancy in the IRC SetStatisticsUpdate [1]. > SetSta

Re: [Discuss] Document Snapshot Summary Optional Fields for Standardization

2024-12-16 Thread Jean-Baptiste Onofré
Hi, yes I agree, I don't think we have to couple this to a spec version. Regards JB On Wed, Dec 11, 2024 at 11:17 PM Russell Spitzer wrote: > > I want to float this back up, I think this is a really good idea for cross > engine support. I don't think we have to tie this to any specific Spec > versio

Re: [DISCUSS] Spark Catalog - Drop vs Drop with Purge

2024-12-16 Thread Jean-Baptiste Onofré
It sounds good to me. Thanks ! Regards JB On Wed, Dec 11, 2024 at 7:20 PM Russell Spitzer wrote: > > Hi Y'all! > > Today we had a little discussion on the Apache Iceberg Catalog Community Sync > about DROP and DROP WITH PURGE. Currently the SparkCatalog implementation > inside of the reference l

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-12-12 Thread Jean-Baptiste Onofré
ther. >> >> > > >> >> > > A couple of points to note: >> >> > > >> >> > > Both approaches would require changes to the "loadTable" endpoint. >> >> > > A minor advantage of HTTP caching is that it integrates seamlessly >>

Re: Contributor guidelines for becoming a committer

2024-12-11 Thread Jean-Baptiste Onofré
ake sure we get the > guidelines published before we make changes. > > I was NOT saying that we don't value non-code contributions today. We do, and > we made sure that the doc doesn't talk about only code. It refers to > contributions and reviews, not code. > >

Re: Contributor guidelines for becoming a committer

2024-12-11 Thread Jean-Baptiste Onofré
Hi Justin For reference, here's the original PR: https://github.com/apache/iceberg/pull/11670 I agree with you, and I commented in the PR about the same points: - considering non code contributions (also referring com dev page https://community.apache.org/pmc/adding-committers.html as "example")

Re: [DISCUSS] Apache Iceberg Summit 2025 - Selection Committee

2024-12-10 Thread Jean-Baptiste Onofré
Hi everyone, Thanks everyone for volunteering to help on the selection committee. Today is Dec 10th, so the call for volunteers is now closed. I'm gathering all volunteers from this thread to submit to the Iceberg PMC. Thanks again, Regards JB On Tue, Nov 26, 2024 at 10:42 AM Jean-Baptiste O

Re: REST catalog high availability

2024-12-09 Thread Jean-Baptiste Onofré
Hi Vladimir, As you said, today, it's possible to use a LB in front of multiple instances (using nginx, ELB, ...). I think it's pretty easy to set up, and it stays at the "infrastructure" level. As it's possible to plug the HTTP5 client in Iceberg REST client, I think it's possible to inject PoolingHttpClientCo

Re: [PROPOSAL] Create Iceberg DockerHub repository

2024-12-09 Thread Jean-Baptiste Onofré
limited. > > Best > Piotr > > > > > On Fri, 22 Nov 2024 at 19:03, Jean-Baptiste Onofré wrote: >> >> Hi >> >> That's correct: in Sung's PR, I can see the secret.DOCKERHUB_USER and >> secret.DOCKERHUB_TOKEN. >> So, we should be a

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-12-03 Thread Jean-Baptiste Onofré
+1 (non binding) Regards JB On Thu, Nov 21, 2024 at 2:35 PM Bryan Keller wrote: > > Hi Everyone, > > I propose that we release the following RC as the official Apache Iceberg > 1.7.1 release. > > The commit ID is 4a432839233f2343a9eae8255532f911f06358ef > * This corresponds to the tag: apache-i

Re: Storing catalog directly on object store

2024-11-26 Thread Jean-Baptiste Onofré
> back by Ashvin. > > https://docs.google.com/document/d/1yzLXSOtzBXyaWHfeVsWsMu4xmOH8rV6QyM5ZAnJZjMQ/edit?usp=drivesdk > > Thanks, > Vignesh. > > > On Tue, Nov 26, 2024, 11:59 AM Jean-Baptiste Onofré wrote: >> >> Hi Nikhil >> >> Thanks for your m

Re: Storing catalog directly on object store

2024-11-26 Thread Jean-Baptiste Onofré
Hi Nikhil Thanks for your message, very interesting. I think it would be great to involve the Polaris project here as well, as a REST Catalog implementation. The Polaris community is discussing storage/backend right now, so it would be the perfect timing to consider leveraging S3 conditional writ

[DISCUSS] Apache Iceberg Summit 2025 - Selection Committee

2024-11-26 Thread Jean-Baptiste Onofré
Hi everyone, As you probably know, we've been having discussions about the Iceberg Summit 2025. The PMC pre-approved the Iceberg Summit proposal, and one of the first steps is to put together a selection committee that will be responsible for choosing talks and guiding the process. Once we have a

Re: [ACTION REQUIRED] Removal of v3 artifact actions on December 5th

2024-11-25 Thread Jean-Baptiste Onofré
Hi Kevin I did a quick search and I have the same feedback as you: only iceberg-python is impacted. Thanks for the PR ! Regards JB On Mon, Nov 25, 2024 at 9:03 PM Kevin Liu wrote: > > Hey folks, > > I did a code search for both `actions/upload-artifact` and > `actions/download-artifact` in th

Re: [VOTE] Add Variant type to Iceberg Spec

2024-11-25 Thread Jean-Baptiste Onofré
I second Russell here. I think it makes sense to add variant type to V3 spec, even if the implementation details will come later. So +1 to add in the spec. Regards JB On Mon, Nov 25, 2024 at 6:21 PM Russell Spitzer wrote: > > I'm +1, > > 1. I don't think we are going to change our decision on w

Re: [PROPOSAL] Store Iceberg build scans on ASF Develocity (ge.apache.org)

2024-11-25 Thread Jean-Baptiste Onofré
.github.com/resources/github-actions-preventing-pwn-requests/ > > PS: Remote caching is a different topic. IIRC the Iceberg build has some > customizations that render Gradle's caching less efficient as it could be. > > > On 25.11.24 15:10, Jean-Baptiste Onofré wrote: >

Re: [PROPOSAL] Store Iceberg build scans on ASF Develocity (ge.apache.org)

2024-11-25 Thread Jean-Baptiste Onofré
ve it publish to Apache's infra." > > > On 25.11.24 12:39, Eduard Tudenhöfner wrote: > > I think that's a good idea, so +1 from my side. > > On Mon, Nov 25, 2024 at 11:10 AM Jean-Baptiste Onofré > wrote: >> >> Hi folks, >> >> The ASF is

[PROPOSAL] Store Iceberg build scans on ASF Develocity (ge.apache.org)

2024-11-25 Thread Jean-Baptiste Onofré
Hi folks, The ASF is hosting a Gradle Develocity instance where we can store our build scans. It's hosted on https://ge.apache.org. Several projects are using it (Apache Beam, Apache Pekko, etc). Build scans collect information about a build, including the actual output of failed tests. It's con

Re: [PROPOSAL] Create Iceberg DockerHub repository

2024-11-22 Thread Jean-Baptiste Onofré
ould be fully automated using GitHub and GitHub Actions. >> >> I’d love to hear everyone’s thoughts on this! >> >> Best regards, >> Kevin Liu >> >> >> On Fri, Nov 22, 2024 at 6:06 AM Jean-Baptiste Onofré >> wrote: >>> >>> Hi fo

Re: [PROPOSAL] Create Iceberg DockerHub repository

2024-11-22 Thread Jean-Baptiste Onofré
eberg. We should have a separate discussion on which >>> integration images Iceberg should officially support. >>> >>> For now, maintaining the REST catalog adapter image has already been >>> approved in earlier discussions, so let’s start with that. >>>

Re: [DISCUSS] Additional language implementations for Iceberg Puffin reader/writer

2024-11-22 Thread Jean-Baptiste Onofré
> Impala prefers to have its own implementation of things (like Parquet >>> reader/writer), so I have to double-check if it's acceptable >>> performance-wise to pull in a Rust implementation of anything. I don't have >>> any experience of doing so, henc

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-11-22 Thread Jean-Baptiste Onofré
Hi Yufei, As discussed on the dev mailing list (with Fokko), the KEYS file to use is: https://dist.apache.org/repos/dist/release/iceberg/KEYS Regards JB On Fri, Nov 22, 2024 at 6:36 AM Yufei Gu wrote: > > Hi Bryan, > > This link seems broken, https://dist.apache.org/repos/dist/dev/iceberg/KEYS.

Re: [DISCUSS] PyIceberg 0.8.1 release

2024-11-21 Thread Jean-Baptiste Onofré
Hi Fokko It makes sense to me. Regards JB On Thu, Nov 21, 2024 at 9:14 AM Fokko Driesprong wrote: > > Hi everyone, > > I suggest following up on the PyIceberg 0.8.0 release with a patch release. > > Currently, we have two candidate bugfixes to be included: > > An issue where it falsely emits a

Re: [VOTE] Deprecate and remove last-column-id

2024-11-21 Thread Jean-Baptiste Onofré
+1 Regards JB On Tue, Nov 19, 2024 at 9:18 AM Fokko Driesprong wrote: > > Hi everyone, > > Based on the positive feedback on the [DISCUSS] thread and the pull-request > on GitHub, I would like to raise a vote to deprecate and remove the > last-column-id field from the spec. Since this is a spe

Re: [Discuss] Proposal to Adjust Catalog Sync Schedule & Cancel Next Wednesday’s Meeting

2024-11-21 Thread Jean-Baptiste Onofré
+1 for Wednesday 9am PST every 3 weeks. Thanks ! Regards JB On Thu, Nov 21, 2024 at 12:48 AM Honah J. wrote: > > Hi everyone, > > Thank you all for your participation in the catalog community sync so far! > I'm writing to discuss changes to the meeting schedule to better fit > everyone's avail

Re: [DISCUSS] Hive Support

2024-11-21 Thread Jean-Baptiste Onofré
Hi Manu It sounds like a plan. I think it makes sense to drop Hive 2 & 3 and encourage use of Hive 4 (mostly documentation task). Regards JB On Wed, Nov 20, 2024 at 7:19 AM Manu Zhang wrote: > > Okay, let me add this option > > D. Drop Hive 2 & 3 support and suggest to use built-in Iceberg supp

Re: [DISCUSS] Iceberg 1.7.1 release

2024-11-21 Thread Jean-Baptiste Onofré
Hi I think Hussein's fix is a good candidate for 1.7.1 as it's a bug introduced in 1.7.0. I'm +1 for a 1.7.1 at least including fixes mentioned by Bryan and also the fix from Hussein. Regards JB On Thu, Nov 21, 2024 at 2:00 PM Hussein Awala wrote: > > Hi Bryan, > > I think https://github.com/apache

Re: [DISCUSS] - Deprecate Equality Deletes

2024-11-19 Thread Jean-Baptiste Onofré
standardize on a view-based approach to >> handle CDC cases. >> Actually, it's already been explored in detail[1] by Jack before. >> >> [1] Improving Change Data Capture Use Case for Apache Iceberg >> <https://docs.google.com/document/d/1kyyJp4masbd1FrIKUHF1ED_z1

Re: [DISCUSS] Deprecate embedded manifests

2024-11-19 Thread Jean-Baptiste Onofré
Hi Fokko As I don’t think it’s actually used, I think it’s fine to deprecate it. Regards JB On Tue, Nov 19, 2024 at 12:32, Fokko Driesprong wrote: > Hi everyone, > > I would like to propose to deprecate embedded manifests > . This has been used b

Re: [DISCUSS] - Deprecate Equality Deletes

2024-11-19 Thread Jean-Baptiste Onofré
esh. >> >> >> On Sat, Nov 9, 2024 at 2:17 AM Shani Elharrar >> wrote: >> >>> JB, this is what we do, we write Equality Deletes and periodically >>> convert them to Positional Deletes. >>> >>> We could probably index the keys, maybe

Re: [DISCUSS] Removal of last-column-id of public API

2024-11-18 Thread Jean-Baptiste Onofré
Hi Fokko I think it makes sense to deprecate and remove the field. +1 Regards JB On Thu, Nov 14, 2024 at 10:01 AM Fokko Driesprong wrote: > > Hi everyone, > > While reviewing the TableMetadataBuilder PR on Iceberg-Rust the other day, I > noticed that it exposes the last-column-id to the publi

Re: [DISCUSS] Iceberg 1.7.1 release

2024-11-15 Thread Jean-Baptiste Onofré
Hi Aihua I don't think we should include such changes in a micro release: micro release is for bug fixes, not adding things like data type. Just my $0.01 :) Regards JB On Fri, Nov 15, 2024 at 6:22 AM Aihua Xu wrote: > > Hi Bryan, > > I would like to include the following in 1.7.1 if possible.

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-15 Thread Jean-Baptiste Onofré
st the community consider the > use cases seriously. We need a way forward. > > I’m also not too concerned about using metadata file paths to verify the > latest table version; clients can simply extract metadata filenames, which > include the UUID. > > Yufei > > > >

Re: [VOTE][Go] Release Apache Iceberg Go v0.1.0 RC2

2024-11-14 Thread Jean-Baptiste Onofré
Topol wrote: > > RC 2 has been uploaded! Sorry about that! > > > On Thu, Nov 14, 2024, 1:00 PM Jean-Baptiste Onofré wrote: >> >> Hi Matt >> >> Can you please update the source distribution on dist.apache.org >> (https://dist.apache.org/repos/dist/dev/icebe

Re: [VOTE][Go] Release Apache Iceberg Go v0.1.0 RC2

2024-11-14 Thread Jean-Baptiste Onofré
Hi Matt Can you please update the source distribution on dist.apache.org (https://dist.apache.org/repos/dist/dev/iceberg/) ? It's still the RC1 here. From an ASF standpoint, that's the only strictly required artifact. Thanks ! Regards JB On Thu, Nov 14, 2024 at 12:06 AM Matt Topol wrote: > > H

[PROPOSAL] Create Iceberg DockerHub repository

2024-11-14 Thread Jean-Baptiste Onofré
Hi folks, While reviewing https://github.com/apache/iceberg/pull/11283, we discussed having a DockerHub repository for Iceberg. I can create this repository, similar to other Apache projects (like for example https://hub.docker.com/r/apache/activemq-classic, https://hub.docker.com/r/apache/airflo

Re: [DISCUSS] Spark 3.3 support?

2024-11-13 Thread Jean-Baptiste Onofré
+1 to deprecating and removing. Users can still use previous Iceberg versions if they need Spark 3.3.0 support. Regards JB On Wed, Nov 13, 2024 at 5:02 PM Anton Okolnychyi wrote: > > What do folks think about our Spark 3.3 support? Spark 3.3.0 was released in > June, 2022. Given the 18 month m

Re: [VOTE] Release Apache PyIceberg 0.8.0rc1

2024-11-13 Thread Jean-Baptiste Onofré
+1 (non binding) I checked: - Signature and hash are OK - ASF header present - LICENSE and NOTICE look good Thanks ! Regards JB On Thu, Nov 7, 2024 at 10:57 PM Kevin Liu wrote: > > Hi Everyone, > > I propose that we release the following RC as the official PyIceberg 0.8.0 > release. > > The co

Re: Optionally disable SSL verification for RESTCatalog

2024-11-13 Thread Jean-Baptiste Onofré
Hi Vladimir, Personally, even testing "local" REST catalogs, I'm setting up SSL certificates with a local CA, etc. It's not very painful. That said, I got your point, and I think we can update https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/rest/HTTPClient.java t

Re: Dynamic Flink Iceberg Sink

2024-11-13 Thread Jean-Baptiste Onofré
Thanks for the proposal! I will take a look asap. Regards JB On Mon, Nov 11, 2024 at 6:32 PM Péter Váry wrote: > > Hi Team, > > With Max Michels, we started to work on enhancing the current Iceberg Sink to > allow inserting evolving records into a changing table. > See: > https://docs.google.c

Re: [DISCUSS] Duplicate KEYS files

2024-11-13 Thread Jean-Baptiste Onofré
> https://downloads.apache.org/iceberg points to >>>> https://dist.apache.org/repos/dist/release/iceberg so we don't need to >>>> edit it by hand. >>>> >>>> It looks like the two files are different. For example, search for >>>> "Matt T

Re: [VOTE][Go] Release Apache Iceberg Go v0.1.0 RC0

2024-11-12 Thread Jean-Baptiste Onofré
-1 (non binding) I'm sorry Matt, but the LICENSE is not correct in the source distribution: it contains reference to gradle, parquet, etc. So, the LICENSE is probably a copy from iceberg java. I suggest to start from "regular" LICENSE and add the go section. The rest looks good: - NOTICE is OK -

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-12 Thread Jean-Baptiste Onofré
Hi Fokko I like the idea, but I think it's more a workaround and could be confusing for users :) Regards JB On Tue, Nov 12, 2024 at 2:53 PM Fokko Driesprong wrote: > > Hey Gabor, > > Thanks for raising this. While reading this, my first thought is to leverage > the `tableExists` operation: > h

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-12 Thread Jean-Baptiste Onofré
Hi Gabor, I think it's a bit related to the discussion about "partial metadata retrieval" we have (as you said). We don't yet have a consensus about this discussion and it's a pretty large proposal. I have a preference for isLatest() as it doesn't overlap with filtering table metadata (that we ca

Re: [DISCUSS] Duplicate KEYS files

2024-11-11 Thread Jean-Baptiste Onofré
Hi Fokko As we discussed about that together on Slack, I'm fine merging and removing the dev located KEYS file. Regards JB On Mon, Nov 11, 2024 at 4:13 PM Fokko Driesprong wrote: > > Hi everyone, > > While looking at the release steps for iceberg-go, I noticed that we have two > KEYS files: >

Re: [ANNOUNCE] Apache Iceberg release 1.7.0

2024-11-09 Thread Jean-Baptiste Onofré
Great ! Thanks Russell for driving this release ! Regards JB On Fri, Nov 8, 2024 at 4:33 PM Russell Spitzer wrote: > > I'm pleased to announce the release of Apache Iceberg 1.7.0! > > Apache Iceberg is an open table format for huge analytic datasets. Iceberg > delivers high query performance fo

Re: [DISCUSS] Add a implementation status page for iceberg

2024-11-09 Thread Jean-Baptiste Onofré
Hi, I like the idea. My only comment is probably to use versions instead of check marks, but all good :) Thanks ! Regards JB On Fri, Nov 8, 2024 at 3:33 PM Russell Spitzer wrote: > > Sounds like a great idea to me > > On Fri, Nov 8, 2024 at 7:58 AM Renjie Liu wrote: >> >> Hi: >> >> As iceberg

Re: [DISCUSS] - Deprecate Equality Deletes

2024-11-09 Thread Jean-Baptiste Onofré
column that's unique in a table and want to >>>>>>>>> delete a row by UUID. With position deletes each delete is expensive >>>>>>>>> without an index on that UUID. >>>>>>>>> With equality de

Re: [DISCUSS][Go] First release of iceberg-go

2024-11-08 Thread Jean-Baptiste Onofré
Hi Matt It sounds good to me. I will be happy to review the first release :) Regards JB On Fri, Nov 8, 2024 at 2:14 PM Matt Topol wrote: > > Hey all, > > With the merging of basic read support [1] among other features, I propose > we've hit a minimum threshold that it makes sense to do a v0.1.

Re: [DISCUSS] Partial Metadata Loading

2024-11-05 Thread Jean-Baptiste Onofré
Hi Szehon I agree with you there. I think it's better to move forward step by step, so Eduard's proposal is a good idea. However, I think it's worth to keep the discussion going, at least to shape a good proposal. Regards JB On Wed, Nov 6, 2024 at 3:23 AM Szehon Ho wrote: > > There seems to b

Re: [VOTE] Release Apache Iceberg 1.7.0 RC1

2024-11-05 Thread Jean-Baptiste Onofré
+1 (non binding) I checked: - Parquet has been updated - the planned Avro readers are actually used in both Flink and Spark - Signature and hash are good - No binary file found in the source distribution - ASF header is present in all expected files - LICENSE and NOTICE look good - Build

Re: [Discuss] Iceberg View Interoperability

2024-11-04 Thread Jean-Baptiste Onofré
Hi Ajantha, During CommunityOverCode, I chatted with Matt Topol about Substrait and ADBC. I checked the Substrait support in DataFusion and it's interesting. I was thinking about where to actually store the Substrait plan (I was thinking about an intermediate SQL representation that we could stor

Re: [DISCUSS] REST: OAuth2 Authentication Guide

2024-11-04 Thread Jean-Baptiste Onofré
Hi Christian Nice document, thanks! Definitely a great idea to "document" the OAuth2 flow. My only comment is that we should document not only the client side (which the doc already does well), but also the server side (it might help to understand the full picture). I propose to have a group effort o

Re: [DISCUSS] Change Behavior for SchemaUpdate.UnionByName

2024-11-04 Thread Jean-Baptiste Onofré
Hi Rocco Thanks for bringing up this discussion. In the context of multi-language support (Python, Rust, Go, Java, ...), I'm more in favour of option 1 (updating the docs). Implementations can deal with that. Regards JB On Thu, Oct 31, 2024 at 6:40 PM Rocco Varela wrote: > > Hi everyone, > > Apologi

Re: [DISCUSS] - Deprecate Equality Deletes

2024-10-31 Thread Jean-Baptiste Onofré
Hi Russell Thanks for the nice writeup and the proposal. I agree with your analysis, and I have the same feeling. However, I think more engines than Flink write equality delete files. So, I agree to deprecate in V3, but we should maybe be more "flexible" about removal in V4 in order to give time to

Re: [VOTE] Release Apache Iceberg 1.7.0 RC0

2024-10-31 Thread Jean-Baptiste Onofré
+1 (non binding) I checked the LICENSE and NOTICE, and they both look good to me (the same as in previous releases), so not a blocker for me. I also checked: - the planned Avro readers are actually used in both Flink and Spark - Signature and hash are good - No binary file found in the

Re: [VOTE] Deletion Vectors in V3

2024-10-29 Thread Jean-Baptiste Onofré
+1 (non binding) Regards JB On Tue, Oct 29, 2024 at 10:45 PM Anton Okolnychyi wrote: > > Hi folks, > > We have been discussing the new layout for position deletes in V3 for a while > now. It seems the community reached consensus. I'd like to start a vote on > adding deletion vectors to the V3

Re: [PROPOSAL] Refactore use of Guava Lists.*

2024-10-25 Thread Jean-Baptiste Onofré
style around collection instantiation. > > Thanks > Eduard > > On Thu, Oct 24, 2024 at 8:20 AM Jean-Baptiste Onofré > wrote: >> >> Hi folks, >> >> We are using Guava for different "utils" methods. Especially, we are >> using Guava to crea

Re: [PROPOSAL] Refactore use of Guava Lists.*

2024-10-24 Thread Jean-Baptiste Onofré
re’s value here, but even if this were an arbitrary choice I >> don’t think there’s value in refactoring to avoid it. If and when these >> convenience methods are deprecated, we can always replace them with our own >> utility in iceberg-common and move on without a ton of change

Re: [DISCUSS] Apache Iceberg 1.7.0 Release Cutoff

2024-10-23 Thread Jean-Baptiste Onofré
iated if I can get more eyes on it. >> >> Thanks, >> Manu >> >> On Wed, Oct 23, 2024 at 11:03 PM Russell Spitzer >> wrote: >>> >>> Keep up coming :) I did a pass on Prashant's as well >>> >>> On Wed, Oct 23, 2024 at 12:47 A

[PROPOSAL] Refactore use of Guava Lists.*

2024-10-23 Thread Jean-Baptiste Onofré
Hi folks, We are using Guava for various "utils" methods. In particular, we are using Guava to create lists and maps. For instance, we do (and we enforce the use of): List myList = Lists.newArrayList(); or Map myMap = Maps.newHashMap(); While this was a good idea up to JDK 7, these methods are now unne

Re: Overwrite old properties on table replace with REST catalog

2024-10-22 Thread Jean-Baptiste Onofré
time it highlights that engine developers are having hard >> time defining proper semantics for CREATE OR REPLACE in the Iceberg >> integrations, so a paragraph or so in the main Iceberg spec may help us >> align expectations. >> >> Regards, >> Vladimir. >> >

Re: [DISCUSS] Apache Iceberg 1.7.0 Release Cutoff

2024-10-22 Thread Jean-Baptiste Onofré
> >> >> On Mon, Oct 21, 2024 at 8:56 AM Russell Spitzer >> wrote: >>> >>> That's still my current plan >>> >>> On Mon, Oct 21, 2024 at 10:52 AM Rodrigo Meneses wrote: >>>> >>>> Hi, team. Are we s

Re: [VOTE] Endpoint for refreshing vended credentials

2024-10-22 Thread Jean-Baptiste Onofré
Hi +1 (non binding) @Dmitri I don't think it's an issue: the endpoint can return multiple credentials, and the client will take one. The credential refresh is not that costly, though (and it is required to verify that the credential is valid at retrieval time). Regards JB On Tue, Oct 22, 2024 at 5:09 PM Eduard Tudenhöf
