Re: Welcome Huaxin Gao as a committer!

2025-02-06 Thread Fokko Driesprong
Congratulations Huaxin! Op do 6 feb 2025 om 12:21 schreef Russell Spitzer : > Congratulations! > > On Thu, Feb 6, 2025 at 11:35 AM Péter Váry > wrote: > >> Congratulations! >> >> Matt Topol ezt írta (időpont: 2025. febr. 6., >> Cs, 10:40): >> >>> Congrats! Welcome! >>> >>> On Thu, Feb 6, 2025,

Re: guideline for interface change

2025-02-02 Thread Fokko Driesprong
Hey Aihua, Late to the party here, but the docs on deprecation can be found here: https://iceberg.apache.org/contribute/#semantic-versioning Hope this helps! Kind regards, Fokko Op zo 2 feb 2025 om 21:12 schreef Aihua Xu : > Thanks folks for suggestions. I will keep the existing one and mark it

Re: [VOTE] Release Apache Iceberg 1.7.2 rc0

2025-01-27 Thread Fokko Driesprong
93550 gpg: Good signature from "Fokko Driesprong " [unknown] gpg: WARNING: This key is not certified with a trusted signature! gpg: There is no indication that the signature belongs to the owner. Primary key fingerprint: FCD3 779E 399C 53D9 95FC 82A3 5171 BA3E 5449 3550 *➜

Re: [DISCUSS/VOTE] Add in ChangeLog Reserved Field IDs to Spec and Decrement Row Lineage Reserved IDs

2025-01-27 Thread Fokko Driesprong
+1 Op ma 27 jan 2025 om 10:54 schreef Honah J. : > +1, thanks for driving this! > > Best Regards, > Honah > > On Sun, Jan 26, 2025 at 3:20 PM Steven Wu wrote: > >> +1 >> >> On Sun, Jan 26, 2025 at 3:01 PM John Zhuge wrote: >> >>> +1 (non-binding) >>> >>> John Zhuge >>> >>> >>> On Sun, Jan 26, 2

Re: [VOTE] Release Apache Iceberg 1.7.2 rc0

2025-01-26 Thread Fokko Driesprong
Hey Ajantha, As mentioned in the 1.7.2 DISCUSS thread, we don't want to leave the 1.7.x version in a broken state. Since the 1.8.x ships with new features, it will require more thorough testing. I still think it would be good to get 1.7.2 to the public ASAP. Kind regards, Fokko Op ma 27 jan 2025

Re: [discuss] Standardizing Naming Schemes for Language-Specific Configurations

2025-01-24 Thread Fokko Driesprong
We formalized some of the configurations in the REST spec. Still, I would be reluctant to try to create an exhaustive list of configuration options since it will be tough to m

Re: [VOTE] REST API changes for freshness-aware table loading

2025-01-24 Thread Fokko Driesprong
+1 I think this is an elegant solution, thanks for working on this Gabor! Kind regards, Fokko Op vr 24 jan 2025 om 21:31 schreef Honah J. : > +1, thanks Gabor! > > Best regards, > Honah > > On Fri, Jan 24, 2025 at 12:30 PM Russell Spitzer < > russell.spit...@gmail.com> wrote: > >> +1 'd on the

Re: [VOTE] Add initial/write defaults to REST spec

2025-01-24 Thread Fokko Driesprong
+1 Op za 25 jan 2025 om 00:55 schreef Honah J. : > +1, thanks for driving this! > > Best Regards, > Honah > > On Fri, Jan 24, 2025 at 3:50 PM rdb...@gmail.com wrote: > >> +1 >> >> On Fri, Jan 24, 2025 at 2:25 PM Yufei Gu wrote: >> >>> +1 >>> Yufei >>> >>> >>> On Fri, Jan 24, 2025 at 2:15 PM Amo

Re: [DISCUSS] Apache Iceberg (java) 1.8.0 release

2025-01-24 Thread Fokko Driesprong
Added! +1 on the more frequent releases! Kind regards, Fokko Op za 25 jan 2025 om 02:01 schreef Dmitri Bourlatchkov : > Thanks for the clarification, Amogh! Much appreciated! > > Would you mind adding these PRs to the 1.9 milestone, please? > > https://github.com/apache/iceberg/pull/11992 > htt

Re: [DISCUSS, VOTE] OpenAPI Metadata Update for EnableRowLineage

2025-01-23 Thread Fokko Driesprong
+1 Thanks Russell Op do 23 jan 2025 om 18:47 schreef Aihua Xu : > + (non binding). > > Thanks Russell. > > On Thu, Jan 23, 2025 at 2:05 AM Jean-Baptiste Onofré > wrote: > >> +1 (non binding) >> >> Regards >> JB >> >> On Wed, Jan 22, 2025 at 11:51 PM Russell Spitzer >> wrote: >> > >> > Hey Y'al

Re: Very strange (AI generated) issues

2025-01-22 Thread Fokko Driesprong
useless takes a lot of time. This is part of the problem, that it takes a >> lot of energy and time to determine if those are valid or not - and with >> such a rate, it's not sustainable just to analyze whether they are good or >> bad. >> >> J. >> >>

Re: [VOTE] Document Snapshot Summary Optional Fields as Subsection of Appendix F in Spec

2025-01-22 Thread Fokko Driesprong
+1 Op wo 22 jan 2025 om 08:21 schreef Péter Váry : > +1 > > On Wed, Jan 22, 2025, 06:06 huaxin gao wrote: > >> +1 (non-binding) >> >> On Tue, Jan 21, 2025 at 6:04 PM Manu Zhang >> wrote: >> >>> +1 (non-binding) >>> >>> Thanks & Regards >>> >>> On Wed, Jan 22, 2025 at 8:06 AM Daniel Weeks wrote

Re: Very strange (AI generated) issues

2025-01-22 Thread Fokko Driesprong
Hey Jarek, Thanks for bringing this to our attention. When you talk about flooding, how many are we talking about? I see some suspicious issues (e.g., here), but not many. I hope this will come to a halt soon because it is all additional work, and we als

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Fokko Driesprong
+1 Thanks for cleaning this up Christian! Kind regards, Fokko Op di 21 jan 2025 om 08:25 schreef Christian Thiel < christian.t.b...@gmail.com>: > Hi everyone, > > based on good feedback on the [DISCUSS] thread [1] I would like to raise > a vote to deprecate the `snapshot-id` field of the `SetSt

Re: [Discuss][Vote] Spec Change - Add optional field added-rows to Snapshot for Row Lineage

2025-01-17 Thread Fokko Driesprong
+0, as I agree with Amogh, I think it would fit nicely with Honah's work of formalizing the properties. Kind regards, Fokko Op vr 17 jan 2025 om 08:55 schreef Honah J. : > +1 > > Best, > Honah > > On Thu, Jan 16, 2025 at 22:54 Manish Malhotra < > manish.malhotra.w...@gmail.com> wrote: > >> +1,

Re: [DISCUSS] Apache Iceberg 1.7.2 release

2025-01-16 Thread Fokko Driesprong
pass on GH issues and I will > >> run the release (I will ping you on Slack). > >> > >> Regards > >> JB > >> > >> On Tue, Jan 14, 2025 at 9:07 AM Fokko Driesprong > wrote: > >> > > >> > Good morning everyone, > >>

Re: [DISCUSS] Remove snapshot-id from IRC SetStatisticsUpdate

2025-01-15 Thread Fokko Driesprong
>> >> [1] https://github.com/apache/iceberg-python/pull/1285 >> >> On Mon, Dec 16, 2024 at 5:30 AM Fokko Driesprong >> wrote: >> >>> Hey Christian, >>> >>> Great catch, I would also be in favor of removing the outer one. I don't >>

Re: [VOTE] Document Snapshot Summary Optional Fields as Appendix in Spec

2025-01-15 Thread Fokko Driesprong
+1 Op wo 15 jan 2025 om 16:21 schreef Eduard Tudenhöfner < etudenhoef...@apache.org>: > +1 > > On Wed, Jan 15, 2025 at 1:20 AM rdb...@gmail.com wrote: > >> The content looks correct to me, but because this states a requirement >> ("Metrics must be accurate if written") I would rather move this c

Re: [DISCUSS] Apache Iceberg 1.7.2 release

2025-01-14 Thread Fokko Driesprong
'm also fine either way for the release manager, I can tackle it > (with you) or you do, everything is fine for me :) > > Thanks ! > Regards > JB > > On Mon, Jan 13, 2025 at 7:24 AM Fokko Driesprong wrote: > > > > Hi everyone, > > > > Over the weekend

[DISCUSS] Apache Iceberg 1.7.2 release

2025-01-12 Thread Fokko Driesprong
Hi everyone, Over the weekend the last item from the 1.7.2 milestone has been resolved. I took the liberty of cherry-picking the commits to the 1.7.x branch

Re: [ANNOUNCE] Release Apache Iceberg Rust v0.4.0

2024-12-23 Thread Fokko Driesprong
Thanks for driving this release, Sung, and thanks to everyone who contributed! Kind regards, Fokko Op di 24 dec 2024 om 05:22 schreef Xuanwo : > Thank you for building this release! > > Really nice gift. > > On Tue, Dec 24, 2024, at 11:53, Sung Yun wrote: > > Hi all, > > > > The Apache Iceberg R

Re: [VOTE] Release Apache Iceberg Rust 0.4.0 RC3

2024-12-23 Thread Fokko Driesprong
+1 binding - Checked signature and checksum - Checked licenses - Ran test suite and verify.py - Did some local testing Thanks for running this release Sung, and anyone who contributed to this release. Very excited to see this next milestone for Iceberg-Rust. Kind regards, Fokko Op m

Re: [VOTE] Release Apache Iceberg Rust 0.4.0 RC2

2024-12-19 Thread Fokko Driesprong
Hey everyone, Unfortunately -1 (binding) from my end. It looks like we're able to read tables with merge-on-read deletes and we silently ignore them. This leads to data correctness issues and should be fixed before releasing it to the public. See #826

Re: [DISCUSS] Spec change to fix snapshot-logs reset issue when replacing tables in REST catalog

2024-12-19 Thread Fokko Driesprong
Hey Yuya, Thanks for raising this. When possible, I'd like to avoid additional flags to avoid confusion. For example, in the PR the purge flag is only taken into account when you remove the main ref. I would be leaning towards keeping the snapshot-log, instead of purging it. The snapshot-log will

Re: [VOTE] Release Apache Iceberg Rust 0.4.0 RC2

2024-12-18 Thread Fokko Driesprong
Hey Kevin, Ran into the same thing :) Currently, the tests don't support Docker, I've switched to Podman and it works like a charm. Kind regards, Fokko Op wo 18 dec 2024 om 16:46 schreef Kevin Liu : > Hey Sung, > > Thanks fo

Re: REST catalog high availability

2024-12-18 Thread Fokko Driesprong
Hey Vladimir, Thanks for raising this thread. I'm also reluctant to add this to the application layer. We would also need to support this with the other clients that are out there. Did you give JB's suggestion around the PoolingHttpClientConnectionManager a try? Kind regards, Fokko Op di 17 dec

Re: [discuss] Allow 200 responses for HEAD requests in REST API

2024-12-18 Thread Fokko Driesprong
Hey Kevin, I also agree with Yufei. For PyIceberg we had a long list of issues around the head request (#1363 gives a nice overview) to check if the table is there (and that also has just been added to Java

Re: 1.7.1 breaking change related to ADLS support

2024-12-17 Thread Fokko Driesprong
I took the liberty of creating a 1.7.2 milestone: https://github.com/apache/iceberg/milestone/52 Kind regards, Fokko Op di 17 dec 2024 om 09:58 schreef Fokko Driesprong : > Thanks for raising this Alex, > > I suggest doing a 1.7.2 patch release since we don't want to leave the >

Re: 1.7.1 breaking change related to ADLS support

2024-12-17 Thread Fokko Driesprong
Thanks for raising this Alex, I suggest doing a 1.7.2 patch release since we don't want to leave the 1.7.x version in a broken state for the ADLSFileIO. Kind regards, Fokko Op di 17 dec 2024 om 07:40 schreef Jean-Baptiste Onofré : > Hi Alex, > > It was exactly my concern (and question) when I d

Re: [DISCUSS] Remove snapshot-id from IRC SetStatisticsUpdate

2024-12-16 Thread Fokko Driesprong
Hey Christian, Great catch, I would also be in favor of removing the outer one. I don't see any value in having them both. Kind regards, Fokko Op ma 16 dec 2024 om 14:26 schreef Jean-Baptiste Onofré : > Hi, > > I saw the discussion on Slack. Yeah, it's redundant. > I know some catalogs only con

Re: [Discuss] Document Snapshot Summary Optional Fields for Standardization

2024-12-16 Thread Fokko Driesprong
I'm in favor of this as well. While working on PyIceberg I had to deduce this from the Java code, having a more condensed version in the appendix of the spec would be great. Kind regards, Fokko Op ma 16 dec 2024 om 14:21 schreef Jean-Baptiste Onofré : > Hi, > > yes I agree, I don't think we have

Re: New committer: Scott Donnelly

2024-12-11 Thread Fokko Driesprong
Congratulations Scott! Kind regards, Fokko Op wo 11 dec 2024 om 11:56 schreef Manu Zhang : > Congratulations Scott! > > Thanks, > Manu > > On Wed, Dec 11, 2024 at 3:21 PM Eduard Tudenhöfner < > etudenhoef...@apache.org> wrote: > >> Congrats Scott! >> >> On Wed, Dec 11, 2024 at 7:35 AM roryqi wr

Re: [Discussion] Maintain vendor neutrality on the quickstart page

2024-12-10 Thread Fokko Driesprong
di 10 dec 2024 om 11:48 schreef Ajantha Bhat : > That's a good suggestion Fokko. > It would avoid maintaining one more docker image. We can update the > quickstart to use the docker image provided by Spark. > > - Ajantha > > On Tue, Dec 10, 2024 at 4:08 PM Fokko Drie

Re: [Discussion] Maintain vendor neutrality on the quickstart page

2024-12-10 Thread Fokko Driesprong
Hey Ajantha, Thanks for bringing this up, we should both remove the vendor reference and bring this back up to date. My preference would be to rely on the Spark image provided by the Apache Spark project, similar to what we do for the Hive

New committer: Matt Topol

2024-12-10 Thread Fokko Driesprong
in our project community. Fokko Driesprong On behalf of the Iceberg PMC

Re: [Proposal] Automating the PyIceberg Release Process

2024-12-04 Thread Fokko Driesprong
o push the artifacts. I can generate a new PyPI token using > my own PyPI account, or we could request one from ASF Infra. > > Thanks again for your help! > > Best, > Kevin Liu > > [1] https://reproducible-builds.org/ > [2] https://github.com/wimglenn/setuptools-reproducibl

Re: [VOTE] Release Apache PyIceberg 0.8.1rc1

2024-12-03 Thread Fokko Driesprong
+1 (binding) Checked checksums, signatures, and licenses. Honah, there are some open PRs to bump to the latest dependencies (e.g. Pandas 2.2.3), except for the warning, everything works well. Would be good to get those bumped at some point :)

Re: [Proposal] Automating the PyIceberg Release Process

2024-12-03 Thread Fokko Driesprong
Hey Kevin, First of all, thanks for working on the releases, that's always much appreciated. Regarding the changes to the release process, I'm all for automating as much as possible, but I have some concerns. I also think it is important to split out nightly builds, and the release process in gen

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-12-02 Thread Fokko Driesprong
Hey Bryan, Thanks for running the release! +1 binding from my end. Checked signature and checksum, ran license checks, and built against JDK17. Kind regards, Fokko Op di 3 dec 2024 om 08:36 schreef Driesprong, Fokko : > Hey Bryan, > > Thanks for running the release! +1 binding from my end. > >

Re: [DISCUSS] Deprecate embedded manifests

2024-11-27 Thread Fokko Driesprong
't allow it in v2. But I'm not sure how that would allow us > to remove code paths associated with it. If it is allowed by an older and > supported version of the spec, then how can we safely remove the code paths > that read it? > > On Fri, Nov 22, 2024 at 2:56 AM Fokko Dri

Re: [DISCUSS] Hive Support

2024-11-27 Thread Fokko Driesprong
Hey Cheng, Thanks for the suggestion. The nightly snapshots are available: https://repository.apache.org/content/groups/snapshots/org/apache/iceberg/iceberg-core/, which might help when working on features that are not released yet (eg Nanosecond timestamps). Besides that, we should run RCs agains

Re: [DISCUSS] iceberg rust 0.4.0 and iceberg pyiceberg_core 0.1.0 release

2024-11-27 Thread Fokko Driesprong
Hey Sung, All for it, and happy to help as well. I'll add it to the agenda for tomorrow's Rust sync. We'll make sure to publish the notes since it is on a US holiday. Kind regards, Fokko Op wo 27 nov 2024 om 19:30 schreef Kevin L

Re: [DISCUSS] Apache Iceberg Summit 2025 - Selection Committee

2024-11-26 Thread Fokko Driesprong
Hey JB, Thanks for organizing this. Happy to help! Kind regards, Fokko Op wo 27 nov 2024 om 06:23 schreef karuppayya : > Hi JB, I am happy to help with this. > - Karuppayya > > On Tue, Nov 26, 2024 at 8:55 PM Renjie Liu > wrote: > >> Hi, JB: >> >> Thanks for driving this. Happy to help! >> >>

Re: [VOTE] Deprecate and remove last-column-id

2024-11-25 Thread Fokko Driesprong
>> >> Regards >> JB >> >> On Tue, Nov 19, 2024 at 9:18 AM Fokko Driesprong >> wrote: >> > >> > Hi everyone, >> > >> > Based on the positive feedback on the [DISCUSS] thread and the >> pull-request on GitHub, I would like

Re: [PROPOSAL] Create Iceberg DockerHub repository

2024-11-22 Thread Fokko Driesprong
. >> > We already found a bug in PyIceberg [1] from integrating with the TCK >> docker image. It would be great to have a nightly build, perhaps we can set >> up a Github Action to automate the docker image publishing. >> > >> > Best, >> > Kevin Liu >

Re: [DISCUSS] Deprecate embedded manifests

2024-11-22 Thread Fokko Driesprong
> of concurrent writes. > > I've written down my proposal here: > https://lists.apache.org/thread/4cm9kc6pkmx5ol218z5yjk41gh9t28qg > > And I thought I share it with you before you decide to deprecate the > manifests field. > > Kind regards, > > Jan > On 22.11.24

Re: [DISCUSS] Proposal to buffer manifest files before updating manifest-list

2024-11-22 Thread Fokko Driesprong
Hi Jan, Thanks for sending out this proposal. While reading through it, two questions pop up: - You mentioned repurposing the manifests field. Currently, this field contains a list of paths that point to the manifest data. Would this also be your suggestion? This way, when committing the

Re: [DISCUSS] Deprecate embedded manifests

2024-11-22 Thread Fokko Driesprong
> wrote: > >> +1, great to have less possible paths. >> >> Thanks >> Szehon >> >> On Thu, Nov 21, 2024 at 10:33 AM Steve Zhang >> wrote: >> >>> +1 to deprecate >>> >>> Thanks, >>> Steve Zhang >>> >>

Re: [DISCUSS] Hive Support

2024-11-22 Thread Fokko Driesprong
I agree with Péter, that sounds like the right approach to me as well. Kind regards, Fokko Op vr 22 nov 2024 om 07:38 schreef Péter Váry : > I would prefer B, and only revert to A if we find that B becomes too > complicated. > > On Fri, Nov 22, 2024, 04:26 Manu Zhang wrote: > >> Hi Peter, >> >>

Re: [DISCUSS] Additional language implementations for Iceberg Puffin reader/writer

2024-11-22 Thread Fokko Driesprong
C++ >> implementation of the Iceberg lib in general for their C++ engine. cc @Gang >> Wu >> >> There seemed to be general support from the community to start up such a >> sub-project, so I'm reaching out now to ask for some guidance so that we >> can get goi

Re: [VOTE] Deprecate and remove last-column-id

2024-11-21 Thread Fokko Driesprong
llowing minor release (1.8.x), the removal from the code is staged for the next minor (1.9.x) or major release (2.x.x), and the removal from the spec is planned for (2.x.x). Hope this clarifies. Kind regards, Fokko Driesprong Op di 19 nov 2024 om 10:45 schreef Manu Zhang : > Thanks Fokko.

[DISCUSS] PyIceberg 0.8.1 release

2024-11-21 Thread Fokko Driesprong
Hi everyone, I suggest following up on the PyIceberg 0.8.0 release with a patch release. Currently, we have two candidate bugfixes to be included: - An issue where it falsely emits a warning when loading a table. - Another issue

[DISCUSS] Deprecate embedded manifests

2024-11-19 Thread Fokko Driesprong
y deprecate them from the spec. It is only supported by Iceberg Java today, and I haven't seen any requests for PyIceberg to add support for this. Any questions or concerns about deprecating the embedded manifests? Kind regards, Fokko Driesprong

[VOTE] Deprecate and remove last-column-id

2024-11-19 Thread Fokko Driesprong
Hi everyone, Based on the positive feedback on the [DISCUSS] thread and the pull-request on GitHub, I would like to raise a vote to deprecate and remove the last-column-id field from

Re: [VOTE] Release Apache PyIceberg 0.8.0rc2

2024-11-15 Thread Fokko Driesprong
+1 binding Thanks for running this release! Checked the signatures, checksums, and licenses. Kind regards, Fokko Op vr 15 nov 2024 om 14:52 schreef Sung Yun : > Hi Kevin, > > Thank you again for running this release! > > I've verified the License headers, checksums and signatures. > > Downloade

Re: [VOTE][Go] Release Apache Iceberg Go v0.1.0 RC2

2024-11-15 Thread Fokko Driesprong
+1 (binding) Verified checksums, signatures, and tests with the validation script using the Apache dist (thanks Kevin!). Did some checks on the dependencies and looks good. Kind regards, Fokko Op vr 15 nov 2024 om 06:48 schreef Eduard Tudenhöfner < etudenhoef...@apache.org>: > +1 (binding) >

Re: Re: [Proposal] Replicating version-hint onto the file system

2024-11-15 Thread Fokko Driesprong
Hey Ashvin, Thanks for taking the time to write up the proposal. I have one big question that we need to clarify first. Many implementations out there today expect that the location is unique to the table, but this isn't called out in the spec exp

Re: [PROPOSAL] Create Iceberg DockerHub repository

2024-11-15 Thread Fokko Driesprong
+1 — excited to see this happen! For the TCK, I think we can release this with the Java together, and have a nightly build (tag the container with nightly Dockerhub). This way we can already test out (and start implementing) the new features in the related projects. Thoughts on that? Regarding th

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-15 Thread Fokko Driesprong
le version; clients can simply extract metadata filenames, which > include the UUID. > > Yufei > > > > > On Tue, Nov 12, 2024 at 7:46 AM Jean-Baptiste Onofré > wrote: > > Hi Fokko > > I like the idea, but I think it's more a workaround an

[DISCUSS] Removal of last-column-id of public API

2024-11-14 Thread Fokko Driesprong
Hi everyone, While reviewing the TableMetadataBuilder PR on Iceberg-Rust the other day, I noticed that it exposes the last-column-id to the public API, but I believe there is no need for it. This field is used to determine th

Re: [DISCUSS] Spark 3.3 support?

2024-11-13 Thread Fokko Driesprong
+1 to deprecating and removing it Kind regards, Fokko Op wo 13 nov 2024 om 18:02 schreef Jean-Baptiste Onofré : > +1 to deprecating and removing. > > Users can still use previous Iceberg versions if they need Spark 3.3.0 > support. > > Regards > JB > > On Wed, Nov 13, 2024 at 5:02 PM Anton Okoln

Re: [DISCUSS] Duplicate KEYS files

2024-11-12 Thread Fokko Driesprong
pache/iceberg/pull/11526 > > We will probably want to merge and remove the dev KEYS first. > > Thanks, > Kevin Liu > > On Mon, Nov 11, 2024 at 11:52 PM Jean-Baptiste Onofré > wrote: > > Hi Fokko > > As we discussed about that together on Slack, I'

Re: [DISCUSS] Duplicate KEYS files

2024-11-12 Thread Fokko Driesprong
Looks like dev/KEYS ⊂ release/KEYS: https://www.diffchecker.com/4oxGhphl/ I've removed the old one. Thanks, everyone, and thanks Kevin for raising the PRs, let's get those in! Kind regards, Fokko Op di 12 nov 2024 om 20:03 schreef Fokko Driesprong : > There is a consensus on merg

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-12 Thread Fokko Driesprong
Hey Gabor, Thanks for raising this. While reading this, my first thought is to leverage the `tableExists` operation: https://github.com/apache/iceberg/blob/e3f39972863f891481ad9f5a559ffef093976bd7/open-api/rest-catalog-open-api.yaml#L1129-L1160 This doesn't return anything today, but we could ret

[DISCUSS] Duplicate KEYS files

2024-11-11 Thread Fokko Driesprong
Hi everyone, While looking at the release steps for iceberg-go, I noticed that we have two KEYS files: - https://dist.apache.org/repos/dist/dev/iceberg/KEYS - https://dist.apache.org/repos/dist/release/iceberg/KEYS (Also availa

Re: [VOTE] Release Apache PyIceberg 0.8.0rc1

2024-11-09 Thread Fokko Driesprong
+1 (binding) Thanks for running this release Kevin! - Verified signatures and checksum - Checked for licenses - Installed and ran tests - Did some local testing Kind regards, Fokko Op za 9 nov 2024 om 00:01 schreef Drew : > +1 (non-binding) > > - verified signature and checksum > - verified RA

Re: [VOTE] Iceberg Rust Sync Meeting Time

2024-11-08 Thread Fokko Driesprong
l send a google calendar event later. >> >> On Mon, Oct 28, 2024 at 8:40 PM Fokko Driesprong >> wrote: >> >>> Hey Renjie, >>> >>> Thanks for organizing this! Both times are very friendly for folks from >>> Europe, so I'

Re: [VOTE] Release Apache Iceberg 1.7.0 RC1

2024-11-07 Thread Fokko Driesprong
Thanks Russell for running this release! +1 (binding) Checked signatures, checksum, licenses and did some local testing. Kind regards, Fokko Op do 7 nov 2024 om 08:35 schreef Eduard Tudenhöfner < etudenhoef...@apache.org>: > +1 (binding) > > Verified signature/checksum/license and build/test wi

Re: [Discuss] Iceberg View Interoperability

2024-11-04 Thread Fokko Driesprong
terop with the > serialized Coral IR. Not sure if it makes sense to have all front-end and > back-end implementations (e.g., Spark to Coral IR or Coral IR to Trino, > etc) be reimplemented in those languages. Such implementations actually > depend on the reuse of the native parsers of those

Re: [VOTE] Release Apache Iceberg 1.7.0 RC0

2024-11-01 Thread Fokko Driesprong
Thanks Russell for running this release! +1 (binding) Checked signatures, checksum, licenses and did some local testing. Kind regards, Fokko Op do 31 okt 2024 om 17:47 schreef Russell Spitzer < russell.spit...@gmail.com>: > @Manu Zhang You are definitely right, I'll get > that in before I do

Re: [DISCUSS] Change Behavior for SchemaUpdate.UnionByName

2024-11-01 Thread Fokko Driesprong
Hey Rocco, Thanks for raising this. I don't have any strong feelings about this, and I agree with Russell that it should not throw an exception. I guess there was no strong reason behind how it is today, but it's just because we leverage the UpdateSchema API, which raises an exception when doing

Re: [VOTE] Deletion Vectors in V3

2024-10-30 Thread Fokko Driesprong
+1 I had to read up a bit, thanks for driving this Anton. Kind regards, Fokko Op do 31 okt 2024 om 07:53 schreef Piotr Findeisen < piotr.findei...@gmail.com>: > Thank you Anton, > > +1 (non-binding) > > > > On Thu, 31 Oct 2024 at 05:07, John Zhuge wrote: > >> +1 (non-binding) >> >> On Wed, Oct

Re: [DISCUSS] Discrepancy Between Iceberg Spec and Java Implementation for Snapshot summary's 'operation' key

2024-10-29 Thread Fokko Driesprong
>>> >>>>> This is one of the reasons I'm opposed to metadata we don't use/need. >>>>> We end up forking the spec and then we have some odd behaviors, a metadata >>>>> which is illegal in one implementation (PyIceberg) will be legal

Re: Iceberg python library sync

2024-10-28 Thread Fokko Driesprong
Thanks Jun for the heads up. Looking forward to seeing everyone tomorrow! Kind regards, Fokko Op do 24 okt 2024 om 08:35 schreef Jun H. : > Hi everyone, > > FYI, the next community python library sync meeting will be on Tuesday > (10/29/2024) at 9 AM (US/Pacific). Here is the meeting agenda for

Re: [Discuss] Iceberg View Interoperability

2024-10-28 Thread Fokko Driesprong
Hey everyone, Views in PyIceberg are not yet as mature as in Java, mostly because tooling in Python tends to work with data frames, rather than SQL. I do think it would be valuable to extend support there. I have a bit of experience in turning SQL into ASTs and extending grammar, and I'm confiden

Re: [VOTE] Iceberg Rust Sync Meeting Time

2024-10-28 Thread Fokko Driesprong
Hey Renjie, Thanks for organizing this! Both times are very friendly for folks from Europe, so I'm fine with both as well. Helpful link to project this onto your timezone: https://www.worldtimebuddy.com/?qm=1&lid=128,2759794,5391959&h=128&date=2024-10-28&sln=23-24&hf=0

Re: [DISCUSS] Discrepancy Between Iceberg Spec and Java Implementation for Snapshot summary's 'operation' key

2024-10-28 Thread Fokko Driesprong
Hey everyone, Thanks Kevin for spotting and raising this. I agree with Ryan that we still should be able to read the table. We had similar issues in the past (looking at you -1 for current-snapshot-id) which we can fix. Can I suggest when we encounter a missing operation, and we write the table a

Re: [DISCUSS] iceberg-rust: pyiceberg_core 0.1.0 Release

2024-08-28 Thread Fokko Driesprong
Thanks for driving this Sung, this is very exciting! 1. The transforms are a good first thing to address. 2. I agree with Xuanwo, that for flexibility we can decouple them. 3. Automation is probably easier than doing it manually (otherwise we would have to document the steps). Kind regards, Fokko

Re: [VOTE] Merge REST Spec change to add RemovePartitionSpecsUpdate update type

2024-08-26 Thread Fokko Driesprong
+1 Op ma 26 aug 2024 om 22:00 schreef Yufei Gu : > +1 > Yufei > > > On Mon, Aug 26, 2024 at 11:06 AM Ryan Blue > wrote: > >> +1 >> >> On Mon, Aug 26, 2024 at 11:04 AM Amogh Jahagirdar <2am...@gmail.com> >> wrote: >> >>> I've opened a PR [1] to add a RemovePartitionSpecsUpdate update type so >>>

Re: [VOTE] Release Apache Iceberg 1.6.1 RC2

2024-08-24 Thread Fokko Driesprong
+1 (binding) - Verified signatures, checksums and ran the tests locally Kind regards, Fokko Op vr 23 aug 2024 om 20:51 schreef Piotr Findeisen < piotr.findei...@gmail.com>: > +1 (non-binding) > > Trino integration > > https://github.com/trinodb/trino/actions/runs/10529992246/job/29179087096?pr=

Re: [DISCUSS] Variant Spec Location

2024-08-22 Thread Fokko Driesprong
ndation. > >>>>>>>>>> > >>>>>>>>>> Yufei > >>>>>>>>>> > >>>>>>>>>> On Wed, Aug 14, 2024 at 7:51 PM Gang Wu > >>> wrote: > >>>>>>>

Re: [VOTE] Release Apache Iceberg 1.6.1 RC1

2024-08-21 Thread Fokko Driesprong
Hey Eduard, I think it relates to this PR. It contains a CVE and would be good to be backported. We wanted to include it in 1.6.1 if we needed another RC, but that didn't happen, so I think we didn't cherry-pick it to 1.6.x branch. Kind regards, Fokk

Re: Type promotion in v3

2024-08-20 Thread Fokko Driesprong
ot something that needs to be done now but laying the ground-work > is useful). Similar to the point above we should be opinionated about this. For example, historically we've been parsing dates strictly, as an example, see DateTimeUtil <https://github.com/apache/iceberg/blob/main/api/s

Re: [DISCUSS] Adding RemovePartitionSpecsUpdate update type to REST

2024-08-20 Thread Fokko Driesprong
+1 Thanks for working on this Op di 20 aug 2024 om 04:16 schreef xianjin : > +1 from my side as well. > > Sent from my iPhone > > On Aug 20, 2024, at 9:09 AM, Yufei Gu wrote: > > > > +1, the new spec looks good to me. It seems like the client-side handling > the heavy lifting of figuring out w

Re: [VOTE] Spec changes in preparation for v3

2024-08-19 Thread Fokko Driesprong
+1 Op ma 19 aug 2024 om 22:01 schreef Russell Spitzer < russell.spit...@gmail.com>: > +1 - Feels duplicative to vote here and approve on the PR > > On Mon, Aug 19, 2024 at 2:41 PM Ryan Blue wrote: > >> Hi everyone, >> >> I'd like to vote on PR #10948 >>

Re: [DISCUSS] Iceberg 1.6.1 release

2024-08-19 Thread Fokko Driesprong
erged on Jul 26th and it would be >> great to make it available to downstream projects. >> >> I volunteer to help with Iceberg 1.6.1 release, to share the operational >> cost. >> >> >> Best >> Piotr >> >> >> >> On Thu, 8 Aug 202

Re: Type promotion in v3

2024-08-19 Thread Fokko Driesprong
Thanks Ryan for bringing this up, that's an interesting problem, let me think about this. we can persist schema_id in the DataFile This was also my first thought. The two drawbacks are: - Distribute all the schemas to the executors, and we have to do the lookup and comparison there. -

Re: Table schema and partition spec update

2024-08-19 Thread Fokko Driesprong
Hey Peter, Thanks for raising this since I recently ran into the same issue. The APIs that we have today nicely hide the field IDs from the user, which is great. I do think all the methods are in there to evolve the schema to the desired one, however, we don't have a way to control the field-IDs.

Re: [VOTE] Release Apache Iceberg Rust 0.3.0 RC1

2024-08-19 Thread Fokko Driesprong
+1 (binding) Thanks Xuanwo for running this release, and sorry for the late vote, I was doing additional tests against Tabular and had to flex my tiny Rust muscle a bit. - Validated the signatures and checksums - Checked out the licenses

Re: [VOTE] Release Apache PyIceberg 0.7.1rc2

2024-08-14 Thread Fokko Driesprong
+1 (binding) Thanks Sung for running this 🙌 - Validated signatures/checksums/license - Ran some basic tests (3.10) Kind regards, Fokko Op wo 14 aug 2024 om 19:57 schreef André Luis Anastácio : > >- validated signatures and checksums > > >- checked license > > >- ran tests and test-

Re: [DISCUSS] Cleanup svn dev/iceberg

2024-08-14 Thread Fokko Driesprong
Thanks! Kind regards, Fokko Op wo 14 aug 2024 om 17:57 schreef Xuanwo : > Got it. I will clean them up. > > On Wed, Aug 14, 2024, at 23:54, Fokko Driesprong wrote: > > Hey Xuanwo, > > Feel free to clean those up as they should have been cleaned up a long > time ago.

Re: [DISCUSS] Cleanup svn dev/iceberg

2024-08-14 Thread Fokko Driesprong
Hey Xuanwo, Feel free to clean those up as they should have been cleaned up a long time ago. I'm also happy to do it myself, let me know! Kind regards, Fokko Op wo 14 aug 2024 om 17:49 schreef Xuanwo : > Hi, > > The dev branch of SVN is used to host artifacts awaiting a vote. It > increases the

Re: [VOTE] Merge REST spec clarification on how servers should handle unknown updates/requirements

2024-08-14 Thread Fokko Driesprong
+1 Thanks for clarifying this Kind regards, Fokko Op wo 14 aug 2024 om 04:34 schreef xianjin : > +1 > > On Aug 14, 2024, at 2:24 AM, Ryan Blue > wrote: > > > +1 > > On Tue, Aug 13, 2024 at 8:59 AM Yufei Gu wrote: > >> +1 >> Yufei >> >> >> On Tue, Aug 13, 2024 at 8:57 AM Eduard Tudenhöfner <

Re: [DISCUSS] Variant Spec Location

2024-08-14 Thread Fokko Driesprong
+1 to what's already being said here. It is good to copy the spec to Iceberg and add context that's specific to Iceberg, but at the same time, we should maintain compatibility. Kind regards, Fokko Op wo 14 aug 2024 om 15:30 schreef Manu Zhang : > +1 to copy the spec into our repository. I think

Re: Welcome Péter, Amogh and Eduard to the Apache Iceberg PMC

2024-08-14 Thread Fokko Driesprong
Congratulations and welcome! Kind regards, Fokko Op wo 14 aug 2024 om 06:23 schreef Xuanwo : > Congrats! Thanks for your contribution. > > On Wed, Aug 14, 2024, at 11:32, Renjie Liu wrote: > > Congratulations, everyone! > > On Wed, Aug 14, 2024 at 11:14 AM roryqi wrote: > > Congrats! > > Steven

Re: [DISCUSS] Start iceberg-rust 0.3.0 release process

2024-08-14 Thread Fokko Driesprong
Thanks Xuanwo for driving this, very excited to see this happening. Let me know if there is anything I can help with! Kind regards, Fokko Op wo 14 aug 2024 om 08:58 schreef Xuanwo : > Hello, everyone > > I'm starting this thread to discuss initiating the release process for > iceberg-rust 0.3.0.

Re: [DISCUSS] Filesystem in PyIceberg

2024-08-12 Thread Fokko Driesprong
Hi André, First of all, thanks for raising this. Maintenance routines are a long-awaited functionality in PyIceberg. The FileIO concept is not limited to PyIceberg, but is also present in Java

Re: [DISCUSS] Flink 1.20: make FLIP-27 default in SQL and mark the old FlinkSource as deprecated

2024-08-12 Thread Fokko Driesprong
Hey Steven, That sounds very exciting! I'm not a heavy Flink user, but I don't see any issues enabling it on Flink 1.20. We should make it explicit in the changelog, and if possible give some hints on how to drain the Flink jobs. Kind regards, Fokko Op ma 12 aug 2024 om 04:57 schreef Steven Wu :

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-09 Thread Fokko Driesprong
b.com/apache/iceberg-python/pull/1026 > > Sung > > On Thu, Aug 8, 2024 at 9:29 AM André Luis Anastácio > wrote: > >> I fixed an overwrite error that, I think, would be good to include in the >> 0.7.1 release https://github.com/apache/iceberg-python/pull/1023 >>

Re: [DISCUSS] Iceberg 1.6.1 release

2024-08-08 Thread Fokko Driesprong
Hey Piotr, We had some delays with the Avro 1.12.0 release, mostly because all the languages were released at once. On the Avro devlist, I suggested releasing 1.11.4 just for Java because of the CVE. Realistically this would be around 1-2 weeks. Does that sound reasonable? Kind regards, Fokko Op
