Re: Welcome Huaxin Gao as a committer!

2025-02-06 Thread Yufei Gu
Congrats Huaxin! Yufei On Thu, Feb 6, 2025 at 9:09 AM Steve Zhang wrote: > Congratulations Huaxin, well deserved! > > Thanks, > Steve Zhang > > > > On Feb 6, 2025, at 8:16 AM, Xingyuan Lin > wrote: > > Congrats Huaxin! > > On Thu, Feb 6, 2025 at 11:11 AM Denny Lee wrote: > >> Congratulations

Re: [VOTE] Add initial/write defaults to REST spec

2025-01-24 Thread Yufei Gu
+1 Yufei On Fri, Jan 24, 2025 at 2:15 PM Amogh Jahagirdar <2am...@gmail.com> wrote: > +1 (binding) > > On Fri, Jan 24, 2025 at 2:02 PM Jean-Baptiste Onofré > wrote: > >> +1 (non binding) >> >> It corresponds to the spec (initial/write default values). >> >> Thanks ! >> Regards >> JB >> >> On Fr

Re: [DISCUSS/VOTE] Add in ChangeLog Reserved Field IDs to Spec and Decrement Row Lineage Reserved IDs

2025-01-24 Thread Yufei Gu
Thanks for fixing this, Russell! +1 for keeping the changelog view related id as is, given the changelog view has been widely used. Yufei On Fri, Jan 24, 2025 at 12:35 PM Russell Spitzer wrote: > We added reserved fields into the Apache Iceberg repo to use with > ChangeLog views but these wer

Re: [DISCUSS, VOTE] OpenAPI Metadata Update for EnableRowLineage

2025-01-23 Thread Yufei Gu
+1 Yufei On Thu, Jan 23, 2025 at 11:05 AM huaxin gao wrote: > +1 (non binding) > > Thanks Russell. > > On Thu, Jan 23, 2025 at 10:55 AM Fokko Driesprong > wrote: > >> +1 >> >> Thanks Russell >> >> On Thu, Jan 23, 2025 at 18:47 Aihua Xu wrote: >> >>> + (non binding). >>> >>> Thanks Russell. >>>

Re: [VOTE] REST API changes for freshness-aware table loading

2025-01-22 Thread Yufei Gu
+1. Thanks, Gabor! A bit more context: we synced on this spec change during this morning's community catalog meeting and reached a general consensus on the approach. Yufei On Wed, Jan 22, 2025 at 12:05 PM Gabor Kaszab wrote: > Hi Iceberg Community, > > I have a PR for changing the REST spec >

Re: [VOTE] Document Snapshot Summary Optional Fields as Subsection of Appendix F in Spec

2025-01-21 Thread Yufei Gu
+1 Thanks Honah! Yufei On Tue, Jan 21, 2025 at 12:45 PM Russell Spitzer wrote: > +1 > > On Tue, Jan 21, 2025 at 2:36 PM rdb...@gmail.com wrote: > >> +1 >> >> On Tue, Jan 21, 2025 at 12:20 PM Honah J. wrote: >> >>> Hi everyone, >>> >>> In the last VOTE >>>

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Yufei Gu
+1 Thanks for removing the redundancy! Yufei On Tue, Jan 21, 2025 at 9:28 AM Jean-Baptiste Onofré wrote: > +1 (non binding) > > Thanks Christian ! > > Regards > JB > > On Tue, Jan 21, 2025 at 8:25 AM Christian Thiel > wrote: > > > > Hi everyone, > > > > based on good feedback on the [DISCUSS] t

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-16 Thread Yufei Gu
It makes sense to have an option to control the max number of snapshots. Thanks Manu for the proposal. Yufei On Thu, Jan 16, 2025 at 7:46 PM Manu Zhang wrote: > Hi all, > > Do you have more comments on this feature? Do you have concerns about > adding a new field to SnapshotRef? > > Thanks, >

Re: [VOTE] Document Snapshot Summary Optional Fields as Appendix in Spec

2025-01-14 Thread Yufei Gu
+1 Yufei On Tue, Jan 14, 2025 at 1:16 PM Kevin Liu wrote: > +1 non-binding. > Already +1 and reviewed the PR. Thanks for adding this! It's very useful > as a reference. > > Best, > Kevin Liu > > On Tue, Jan 14, 2025 at 12:05 PM Russell Spitzer < > russell.spit...@gmail.com> wrote: > >> +1 >> >>

Re: [DISCUSS] Apache Iceberg 1.7.2 release

2025-01-14 Thread Yufei Gu
Hi folks, We are working on a bug fix, https://github.com/apache/iceberg/issues/11922. It'd be nice to include it in 1.7.2. Yufei On Tue, Jan 14, 2025 at 2:00 AM Jean-Baptiste Onofré wrote: > Hi Fokko > > Thanks for the update. I will do a quick pass on GH issues and I will > run the release

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2025-01-03 Thread Yufei Gu
ince it cannot provide an >> ETag. Instead, I’d like to suggest an alternative API: >> >> >> > > Option loadTableIfNoneMatch(TableIdentifier, >> Option) >> >> >> > > Initially, the client would provide None as the tag. If the tag >> is not None and matches the

Re: [DISCUSS] Add a implementation status page for iceberg

2024-12-24 Thread Yufei Gu
>>>> On Sat, Nov 16, 2024 at 3:12 AM Kevin Liu >>>> wrote: >>>> >>>>> Thanks, Renjie! Happy to review and help fill out the matrix! :) >>>>> >>>>> Best, >>>>> Kevin Liu >>>>> >>>>> On Wed

Re: [DISCUSS] Standardizing Error Handling in the Iceberg Spark Module

2024-12-19 Thread Yufei Gu
+1 on the direction. It's great that Spark has standardized the error codes so that Iceberg doesn't have to rely on error messages. Yufei On Thu, Dec 19, 2024 at 8:47 AM rdb...@gmail.com wrote: > This looks like a good improvement to me. Thanks, Huaxin! > > On Wed, Dec 18, 2024 at 11:37 PM huaxi

Re: Optimize object lookup in REST catalog

2024-12-17 Thread Yufei Gu
Seems like a nice optimization. I also echo Piotr's point about the list endpoints. Either a `relation` or a `table-like` would be good to have. Looking forward to a formal proposal! Yufei On Thu, Dec 5, 2024 at 5:37 AM Piotr Findeisen wrote: > Hi > > I like the idea to just "get relation" to get the re

Re: [discuss] Allow 200 responses for HEAD requests in REST API

2024-12-17 Thread Yufei Gu
The distinction between 200 and 204 is subtle enough that I'm comfortable using them interchangeably in this context. My main concern is that, if we make this change, all clients—except for PyIceberg—will need to be updated to support both 200 and 204, since a server could return either status code
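To make the client-side impact concrete, here is a minimal sketch of a client treating 200 and 204 interchangeably for a HEAD existence check. The function name `resource_exists` and the handling of 404 and other codes are assumptions for illustration, not taken from any Iceberg client.

```python
from http import HTTPStatus


def resource_exists(head_status: int) -> bool:
    """Interpret the status code of a HEAD existence check.

    Hypothetical helper: both 200 and 204 mean "exists",
    404 means "does not exist", anything else is an error.
    """
    if head_status in (HTTPStatus.OK, HTTPStatus.NO_CONTENT):
        return True
    if head_status == HTTPStatus.NOT_FOUND:
        return False
    raise ValueError(f"unexpected status for HEAD request: {head_status}")
```

A client written this way keeps working whether the server answers with 200 or 204.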

Re: [DISCUSS] Spark Catalog - Drop vs Drop with Purge

2024-12-11 Thread Yufei Gu
+1 on adding a flag to support the Spark REST client behavior change between v1.8 and v2.0. At the same time, we may further clarify the behavior of the DropTable REST API, https://github.com/apache/iceberg/blob/feed4e2544b5839fbc2fe040965af3906d053302/open-api/rest-catalog-open-api.yaml#L1099

Re: REST catalog high availability

2024-12-09 Thread Yufei Gu
Load balancing operates at a different layer than APIs, with various implementations available, such as etcd and Zookeeper. I’d prefer to avoid introducing additional complexity at the web service API level. Yufei On Mon, Dec 9, 2024 at 8:35 AM Jean-Baptiste Onofré wrote: > Hi Vladimir, > > As

Re: [ANNOUNCE] Apache Iceberg release 1.7.1

2024-12-09 Thread Yufei Gu
Thanks a lot, Bryan! It's great to have multiple things fixed in the new version. It enables upgrades for a variety of downstream projects! Yufei On Mon, Dec 9, 2024 at 12:37 PM Russell Spitzer wrote: > Thanks so much Bryan! Great work getting this out! > > On Mon, Dec 9, 2024 at 2:28 PM Brya

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-12-03 Thread Yufei Gu
+1(binding) Verified signature, checksum, and license check. Build passed. Apache Polaris Test suites passed with the rc1. Yufei On Tue, Dec 3, 2024 at 3:58 PM Kevin Liu wrote: > +1 (non-binding) > Thanks for running the release! > > Verified signature, checksum, and license check. Built and

Re: Retry ValidationException with concurrent writes to the same partition

2024-12-03 Thread Yufei Gu
If you’re looking for finer-grained isolation beyond the snapshot level, the closest feature currently *WIP* is *Fine-Grained Commit* in the REST catalog. You can find more details here: Fine-Grained Commit Design Document

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-12-02 Thread Yufei Gu
>> On Nov 22, 2024, at 12:24 AM, Jean-Baptiste Onofré >> wrote: >> >> Hi Yufei, >> >> As discussed on the dev mailing list (with Fokko), the KEYS file to >> use is: https://dist.apache.org/repos/dist/release/iceberg/KEYS >> >> Regards >> JB >

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-11-21 Thread Yufei Gu
Hi Bryan, This link seems broken, https://dist.apache.org/repos/dist/dev/iceberg/KEYS. Should we use another one, like the one in here https://downloads.apache.org/iceberg/KEYS? Yufei On Thu, Nov 21, 2024 at 2:36 PM Bryan Keller wrote: > Hi Everyone, > > I propose that we release the followin

Re: [DISCUSS] Iceberg 1.7.1 release

2024-11-20 Thread Yufei Gu
Hi Bryan, This bug fix has been merged. Thanks for taking this in. Fix changelog table bug for start time older than current snapshot: https://github.com/apache/iceberg/pull/11564. Yufei On Fri, Nov 15, 2024 at 9:17 AM Aihua Xu wrote: > That makes sense. Originally I thought wasb scheme chan

Re: [Discuss] Proposal to Adjust Catalog Sync Schedule & Cancel Next Wednesday’s Meeting

2024-11-20 Thread Yufei Gu
Thanks for arranging this! +1 on keeping the Wednesday 9 AM Pacific Time meeting every 3 weeks. Yufei On Wed, Nov 20, 2024 at 3:48 PM Honah J. wrote: > Hi everyone, > > Thank you all for your participation in the catalog community sync so far! > I'm writing to discuss changes to the meeting sched

Re: [DISCUSS] Spark 3.3 support?

2024-11-18 Thread Yufei Gu
+1 to deprecate it and remove it. Yufei On Wed, Nov 13, 2024 at 9:17 AM Fokko Driesprong wrote: > +1 to deprecating and removing it > > Kind regards, > Fokko > > On Wed, Nov 13, 2024 at 18:02 Jean-Baptiste Onofré wrote: > >> +1 to deprecating and removing. >> >> Users can still use previous Ic

Re: [ANNOUNCE] Apache Iceberg Go release v0.1.0

2024-11-18 Thread Yufei Gu
Congrats! Thanks Matt for driving it. Thanks everyone for the contribution! Yufei On Mon, Nov 18, 2024 at 11:27 AM Kevin Liu wrote: > Excited to see the first official release of the Apache Iceberg Go > library! > Thanks everyone for contributing! And thanks Matt & Fokko for working on > the

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-18 Thread Yufei Gu
Cache-Control values in the examples above are intended to ensure that > the client validates freshness with the server on every request. Writing > the header in this extended format is primarily to accommodate outdated > HTTP/1.1 implementations. However, under the HTTP/1.1 specificatio

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-15 Thread Yufei Gu
is function seems to serve a >> different purpose. >> > >> > Here is my suggestion: >> > >> > Since HTTP has built-in caching features ( >> https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching), and REST >> catalogs operate over HTTP, it seems

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-12 Thread Yufei Gu
Hi Gamber, Thanks for the proposal! Impala isn’t unique in needing this—I've seen similar requirements from other engines. As others pointed out, using the “tableExists” endpoint seems like a workaround. I don't consider it a permanent way forward. We could address this by either modifying the cu

Re: [DISCUSS] Duplicate KEYS files

2024-11-11 Thread Yufei Gu
+1 merging sounds good. It should still work for previous releases. Yufei On Mon, Nov 11, 2024 at 7:46 AM Xuanwo wrote: > Hi > > Thank you, Fokko, for proposing this. Here is my +1, non-binding. > > I'd also like to mention that as part of the ASF release policy, we must > refer to "https://do

Re: [DISCUSS] Add a implementation status page for iceberg

2024-11-11 Thread Yufei Gu
LGTM. Thanks Renjie! Yufei On Mon, Nov 11, 2024 at 5:38 AM Renjie Liu wrote: > Hi: > > > One minor suggestion: adding a table spec version label along with the > feature in the support matrix. That doesn't apply to REST spec though. > > Updated the doc, please take a look. > > > My only comment

Re: [DISCUSS] Add a implementation status page for iceberg

2024-11-08 Thread Yufei Gu
+1 looking forward to it. One minor suggestion: adding a table spec version label along with the feature in the support matrix. That doesn't apply to REST spec though. Yufei On Fri, Nov 8, 2024 at 9:31 AM Kevin Liu wrote: > Hi Renjie, > > I absolutely love this idea! I wanted to do something s

Re: [VOTE] Release Apache Iceberg 1.7.0 RC1

2024-11-05 Thread Yufei Gu
+1 (binding) Verified signature, checksum, license, build. Successfully tested the following Spark SQL commands on Polaris, using Spark 3.5.3 with the binary artifacts Iceberg 1.7.0 jar. All operations worked as expected. create database db1; show databases; create table db1.t1 (id int, name st

Re: [DISCUSS] Discrepancy Between Iceberg Spec and Java Implementation for Snapshot summary's 'operation' key

2024-10-17 Thread Yufei Gu
Hi Sung, It seems we are running into issues related to a mismatch between the REST spec and table specifications. Currently, there's no clear definition of how the REST spec is meant to support different table specs. The closest reference I found is this statement

Re: [DISCUSS] Remove iceberg-pig module ?

2024-10-17 Thread Yufei Gu
+1 for deprecating it in 1.7 Yufei On Thu, Oct 17, 2024 at 9:51 AM Ajantha Bhat wrote: > +1 for dropping it. > > On Thu, Oct 17, 2024 at 8:55 PM Daniel Weeks wrote: > >> +1 for deprecating and dropping >> >> On Thu, Oct 17, 2024 at 7:46 AM Eduard Tudenhöfner < >> etudenhoef...@apache.org> wrot

Re: [VOTE] Standardize vended credentials in OpenAPI spec

2024-10-15 Thread Yufei Gu
+1 Yufei On Tue, Oct 15, 2024 at 12:09 PM Daniel Weeks wrote: > +1 > > On Tue, Oct 15, 2024 at 10:42 AM Russell Spitzer < > russell.spit...@gmail.com> wrote: > >> +1 >> >> On Tue, Oct 15, 2024 at 12:28 PM Bryan Keller wrote: >> >>> +1 >>> >>> On Oct 15, 2024, at 10:14 AM, Eduard Tudenhöfner <

Re: [DISCUSS] REST: Refreshing vended credentials

2024-10-14 Thread Yufei Gu
Hi Eduard, Thanks for the proposal. I'm excited about the new spec. I have two questions: 1. This is probably a dumb question due to the lack of context, but I'm a bit confused about how clients should select a prefix to use. In scenarios where multiple prefixes exist, which one should the client

Re: Spec changes for deletion vectors

2024-10-12 Thread Yufei Gu
I’d like to offer a perspective on compatibility. If the design is robust and reasonable, it is certainly welcomed. However, if the design falls short, it becomes a compromise—not just for Iceberg users, but for the entire ecosystem. I look forward to hearing your thoughts on this. Yufei On Fr

Re: [DISCUSS] REST: OAuth2 Authentication Guide

2024-10-12 Thread Yufei Gu
Thanks Christian. Nice write-up! Authentication is essential to a production env. It's great to document it well given a lot of people don't necessarily have enough OAuth2 knowledge. Looking forward to the doc PRs and other client side changes. Yufei On Wed, Sep 18, 2024 at 8:31 AM Dmitri Bourl

Re: [VOTE] Table V3 Spec: Row Lineage

2024-10-10 Thread Yufei Gu
+1 Yufei On Thu, Oct 10, 2024 at 3:47 PM Amogh Jahagirdar <2am...@gmail.com> wrote: > +1, I've been reviewing this proposal/spec change for a bit and I think > it's in a good state for the community to work on an implementation. > > Thanks Russell for driving this! > > On Thu, Oct 10, 2024 at 3:

Re: Bayarea Iceberg meetup in November

2024-10-05 Thread Yufei Gu
Sounds great! Looking forward to it! Yufei On Fri, Oct 4, 2024 at 10:13 AM Kevin Liu wrote: > Excited about this, looking forward to it! > > Best, > Kevin > > On Thu, Oct 3, 2024 at 6:11 PM Aihua Xu > wrote: > >> Hi community! >> >> The Apache Iceberg community is gathering in San Francisco o

Re: Changelog scan for table with delete files

2024-09-30 Thread Yufei Gu
Thanks, Peter and Wing Yew Poon, for tackling these! I’ve been eager to review, but this week has been hectic. I plan to check out PR #10935 next week, though I’d be happy if someone beats me to it. Yufei On Mon, Sep 30, 2024 at 3:02 AM Péter Váry wrote: > Hi Team, > > The Changelog scan Java

Re: [VOTE] Table v3 spec: Add unknown and new type promotion

2024-09-30 Thread Yufei Gu
+1(binding) Yufei On Mon, Sep 30, 2024 at 12:42 PM Amogh Jahagirdar <2am...@gmail.com> wrote: > +1 (binding) > > Thanks, > Amogh Jahagirdar > > On Mon, Sep 30, 2024 at 1:39 PM rdb...@gmail.com wrote: > >> +1 (binding) >> >> On Mon, Sep 30, 2024 at 12:32 PM Daniel Weeks wrote: >> >>> +1 (bindi

Re: [Discuss] Geospatial Support

2024-09-30 Thread Yufei Gu
Thanks Szehon! My comments were addressed. I'm ready to vote. Yufei On Mon, Sep 30, 2024 at 11:47 AM Russell Spitzer wrote: > All my concerns are addressed, I'm ready to vote. > > On Mon, Sep 30, 2024 at 1:21 PM Szehon Ho wrote: > >> Hi all, >> >> There have been several rounds of discussion

Re: [DISCUSS] Iceberg Summit 2025 ?

2024-09-30 Thread Yufei Gu
Thank you, JB, for taking the initiative to get the conversation started for the next Iceberg Summit! I’m really excited to see the community considering a hybrid event for 2025. Having the option for in-person interaction would definitely enhance the sense of connection among contributors and us

Re: [DISCUSS] Optimize for CBO

2024-09-26 Thread Yufei Gu
95497abe5579cf492f24ac8c470c7853d59332e9/core/src/main/java/org/apache/iceberg/PartitionStatsUtil.java#L49 > > On Thu, Sep 26, 2024 at 2:57 PM Yufei Gu wrote: > >> Hi Xingyuan, >> I've been reviewing the partition statistics file, and it seems that >> adding partition-

Re: [DISCUSS] Optimize for CBO

2024-09-26 Thread Yufei Gu
Hi Xingyuan, I've been reviewing the partition statistics file, and it seems that adding partition-level min/max values would be a natural fit within Partition Statistics File[1], which is one file per snapshot. We could introduce a few new fields to accommodate these values. While this addition c

Re: REST Catalog based Integration Test for Query Engines

2024-09-23 Thread Yufei Gu
+1 for using the REST catalog in the tests. Thanks Haizhou for doing this! Yufei On Thu, Sep 19, 2024 at 12:41 AM Eduard Tudenhöfner < etudenhoef...@apache.org> wrote: > Thanks for looking into this Haizhou. I'll take a closer look at the PRs > this/next week. > > Eduard > > On Thu, Sep 19, 202

Re: [VOTE] Drop Python3.8 Support in PyIceberg 0.8.0

2024-09-23 Thread Yufei Gu
+1 Thanks for bringing this up. Yufei On Mon, Sep 23, 2024 at 9:27 AM Kevin Liu wrote: > +1 non-binding. Thanks for starting this conversation! > > > On Fri, Sep 20, 2024 at 2:02 PM Sung Yun wrote: > >> Hi folks, >> >> I'd like to start this thread to vote on dropping the support for >> Pytho

Re: [DISCUSS] September board report

2024-09-11 Thread Yufei Gu
LGTM. Thanks Ryan! Yufei On Wed, Sep 11, 2024 at 8:30 AM Xuanwo wrote: > Thank you, this report looks good to me. Happy to see iceberg-rust been > mentioned. > > On Wed, Sep 11, 2024, at 23:02, Jean-Baptiste Onofré wrote: > > Hi Ryan, > > > > It looks good to me. Thanks ! > > > > Regards > > J

Re: [VOTE] Merge guidelines for committing PRs

2024-08-28 Thread Yufei Gu
+1 (binding) Yufei On Wed, Aug 28, 2024 at 4:56 PM Anton Okolnychyi wrote: > +1 (binding) > > Thanks, Micah! > > On Wed, Aug 28, 2024 at 16:36 Dmitri Bourlatchkov > wrote: > >> +1 (nb) >> >> Cheers, >> Dmitri. >> >> On Wed, Aug 28, 2024 at 12:29 PM Micah Kornfield >> wrote: >> >>> I propose to

Re: Request to Add RisingWave to Apache Iceberg Documentation

2024-08-28 Thread Yufei Gu
Hi Alice, Thanks for reaching out. I'm OK with adding related docs. Can you file a PR so that the community can look at the details and make suggestions based on it? Yufei On Wed, Aug 28, 2024 at 2:05 PM timog...@proton.me.INVALID wrote: > I think they must have meant that RisingWa

Re: [VOTE] Merge REST Spec change to add RemovePartitionSpecsUpdate update type

2024-08-26 Thread Yufei Gu
+1 Yufei On Mon, Aug 26, 2024 at 11:06 AM Ryan Blue wrote: > +1 > > On Mon, Aug 26, 2024 at 11:04 AM Amogh Jahagirdar <2am...@gmail.com> > wrote: > >> I've opened a PR [1] to add a RemovePartitionSpecsUpdate update type so >> that removing partition specs update operation against REST catalogs

Re: [VOTE] REST Endpoint discovery

2024-08-20 Thread Yufei Gu
+1 Yufei On Tue, Aug 20, 2024 at 11:16 AM Eduard Tudenhöfner < etudenhoef...@apache.org> wrote: > Hey everyone, > > I'd like to vote on PR #10928 > which adds a way for REST > servers to communicate to clients what endpoints it supports via a new >

Re: [DISCUSS] REST Endpoint discovery

2024-08-20 Thread Yufei Gu
people are generally onboard with the simple approach of using > *" > " * so maybe we should go with this, wdyt @Yufei? > > > > On Fri, Aug 16, 2024 at 6:50 PM Yufei Gu wrote: > >> I’m OK with using a plain string for the endpoint ID, as described in >>

Re: [DISCUSS] Adding RemovePartitionSpecsUpdate update type to REST

2024-08-19 Thread Yufei Gu
+1, the new spec looks good to me. Having the client side do the heavy lifting of figuring out which spec to remove seems like a reasonable approach. Yufei On Mon, Aug 19, 2024 at 4:01 PM Anton Okolnychyi wrote: > Seems reasonable to me. > > - Anton > > On Mon, Aug 19, 2024 at 15:19 Amogh

Re: [VOTE] Spec changes in preparation for v3

2024-08-19 Thread Yufei Gu
+1 Yufei On Mon, Aug 19, 2024 at 1:17 PM Fokko Driesprong wrote: > +1 > > On Mon, Aug 19, 2024 at 22:01 Russell Spitzer < > russell.spit...@gmail.com> wrote: > >> +1 - Feels duplicative to vote here and approve on the PR >> >> On Mon, Aug 19, 2024 at 2:41 PM Ryan Blue wrote: >> >>> Hi everyone,

Re: [DISCUSS] REST Endpoint discovery

2024-08-16 Thread Yufei Gu
"} >>> ] >>> >>> What do people think of that? >>> >>> >>> >>> On Fri, Aug 16, 2024 at 8:13 AM Walaa Eldin Moustafa < >>> wa.moust...@gmail.com> wrote: >>> >>>> Thank you Eduard for sharing th

Re: Spark: Copy Table Action

2024-08-15 Thread Yufei Gu
nice to provide a pure Java implementation in the core, and it >>>> could be extended/reused by different engines, like Spark, to execute it in >>>> a distributed manner, when distributed execution is needed. >>>> >>>> About the copy vs. relative path debate:

Re: [DISCUSS] REST Endpoint discovery

2024-08-15 Thread Yufei Gu
y, but given we limited the scope to service > endpoint discovery, I think that is another discussion later. > > So the current proposed solution from Eduard still feels better to me. And > I think the argument for ambiguity is pretty strong so I am good with the > proposed approach to use

Re: Support row filter & column masking in REST spec

2024-08-15 Thread Yufei Gu
Sorry, I gave the wrong doc, here is the proposal to enable row filtering and column mask: https://docs.google.com/document/d/14nmuxxfzQsYo59o0Fbpb-pxOlzS6bVtduL8P8pwKZ6U/edit#heading=h.irh2zymohx17 Yufei On Thu, Aug 15, 2024 at 9:49 AM Yufei Gu wrote: > Hi Shoham, > > I think this w

Re: Support row filter & column masking in REST spec

2024-08-15 Thread Yufei Gu
Hi Shoham, I think this would be a part of the REST Scan APIs. Here is the proposal, https://docs.google.com/document/d/1FdjCnFZM1fNtgyb9-v9fU4FwOX4An-pqEwSaJe8RgUg/edit#heading=h.cftjlkb2wh4h Yufei On Thu, Aug 15, 2024 at 9:28 AM Shoham Yamin wrote: > Hi what are you thinking about adding in

Re: [DISCUSS] REST Endpoint discovery

2024-08-15 Thread Yufei Gu
+1 for the proposal. In terms of the format, the current solution is simple enough. But I propose to use a trimmed openAPI's format directly. It won't add much cost as we can just take the minimum fields we want. But it opens a window to extend it in the future. For example, it is easier if we want

Re: [DISCUSS] Variant Spec Location

2024-08-14 Thread Yufei Gu
I’m on board with copying the spec into our repository. However, as we’ve talked about, it’s not just a straightforward copy—there are already some divergences. Some of them are under discussion. Iceberg is definitely the best place for these specs. Engines like Trino and Flink can then rely on the

Re: Welcome Péter, Amogh and Eduard to the Apache Iceberg PMC

2024-08-13 Thread Yufei Gu
Congratulations, Peter, Amogh and Eduard! Well deserved! Yufei On Tue, Aug 13, 2024 at 2:03 PM John Zhuge wrote: > Congratulations, everyone! > > On Tue, Aug 13, 2024 at 1:59 PM huaxin gao wrote: > >> Congratulations, everyone! >> >> On Tue, Aug 13, 2024 at 1:53 PM Ryan Blue >> wrote: >> >>>

Re: [VOTE] Merge REST spec clarification on how servers should handle unknown updates/requirements

2024-08-13 Thread Yufei Gu
+1 Yufei On Tue, Aug 13, 2024 at 8:57 AM Eduard Tudenhöfner wrote: > +1 > > On Tue, Aug 13, 2024 at 5:09 PM Amogh Jahagirdar <2am...@gmail.com> wrote: > >> I've opened a PR [1] to clarify in the REST spec that if a server >> receives an unknown update or requirement as part of any the commit >>

Re: [DISCUSS] Clarify in REST spec expected implementation behavior for unknown updates or requirements

2024-08-07 Thread Yufei Gu
Thanks Amogh for starting this discussion. I agree that using 400 makes sense, especially since the server might not fully recognize the request. It’s a straightforward way to handle these situations and avoid potential misunderstanding. Yufei On Tue, Aug 6, 2024 at 5:53 AM Xianjin YE wrote: >

Re: [DISCUSS] Changing namespace separator in REST spec

2024-08-06 Thread Yufei Gu
Thanks Ryan and Daniel for the context. I'm OK that the server provides a separator via config endpoint. Just want to dig a bit more, it does provide more flexibility for the separator, but why do we need this flexibility? Looks like the only benefit is to reconstruct the namespaces. What's the use

Re: [DISCUSS] Extend Snapshot Metadata Lifecycle

2024-08-06 Thread Yufei Gu
> addressed that or at least have a way to do that. > > - Anton > > On Mon, Aug 5, 2024 at 18:12 Yufei Gu wrote: > >> Thanks Szehon for the new proposal. I think it is a useful feature with >> the least spec change. A candidate for v3 spec? >> >> Yufei

Re: [DISCUSS] Extend Snapshot Metadata Lifecycle

2024-08-05 Thread Yufei Gu
g error checks in those Table API's, and updating ExpireSnapshots >>>>> API. >>>>> >>>>> Do we want to consider expiring snapshots in the middle of the history >>>>>> of the table? >>>>>> >>>>> You mean purging

Re: [VOTE] Merge specification clarifications on reading/writing partition values

2024-08-02 Thread Yufei Gu
+1 (binding) Yufei On Fri, Aug 2, 2024 at 11:18 AM Prashant Singh wrote: > +1 (non-binding) > Thanks Micah ! > > Regards, > Prashant > > On Fri, Aug 2, 2024 at 11:06 AM Micah Kornfield > wrote: > >> I've opened a PR [1] to clarify that partition columns must always be >> written by implementat

Re: [DISCUSS] Changing namespace separator in REST spec

2024-08-02 Thread Yufei Gu
potential to break > existing namespaces. > > What's needed IMHO is likely an escaping mechanism - not a single char. > > On 02.08.24 01:42, Yufei Gu wrote: > > +1 on the first option. We may not overly use the config endpoint, but > it'd be suitable in this case

Re: [DISCUSS] Changing namespace separator in REST spec

2024-08-01 Thread Yufei Gu
+1 on the first option. We may not overly use the config endpoint, but it'd be suitable in this case. We can introduce a new field like this: namespace.separator=%2e Yufei On Thu, Aug 1, 2024 at 3:46 PM Ryan Blue wrote: > I think the simplest way to preserve compatibility is to allow this to
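As a rough sketch of how a client could consume such a setting: the config endpoint would return the separator percent-encoded, and the client would decode it before splitting a multi-level namespace. The property name `namespace.separator` comes from the message above; the helper names, the `%1F` (unit separator) fallback, and the example namespace are assumptions for illustration only.

```python
from urllib.parse import unquote


def decode_separator(config: dict, default: str = "%1F") -> str:
    """Return the decoded separator from a catalog config map.

    `default` is an assumed fallback (the percent-encoded unit separator).
    """
    return unquote(config.get("namespace.separator", default))


def split_namespace(encoded: str, separator: str) -> list[str]:
    """Decode a percent-encoded namespace and split it into its levels."""
    return unquote(encoded).split(separator)


# Hypothetical server config advertising "." as the separator.
separator = decode_separator({"namespace.separator": "%2e"})
levels = split_namespace("accounting%2etax", separator)
```

This naive split cannot tell a separator from the same character inside a namespace level, which is the ambiguity an escaping mechanism would have to resolve.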

Re: [DISCUSS] Changing namespace separator in REST spec

2024-08-01 Thread Yufei Gu
+1 for replacing it to be compatible with the new Servlet spec. Yufei On Thu, Aug 1, 2024 at 7:02 AM Eduard Tudenhöfner wrote: > Here's the PR that bumps > Jetty and the Servlet API and reproduces #10338 >

Re: [VOTE] Clarify "File System Tables" in the table spec

2024-08-01 Thread Yufei Gu
+1 (binding) Yufei On Thu, Aug 1, 2024 at 8:33 AM Daniel Weeks wrote: > Added comments to the PR to include a target removal version and > appropriate alternative messaging. > > +1 (binding) > > On Thu, Aug 1, 2024 at 8:24 AM Jack Ye wrote: > >> +1 (binding) >> >> -Jack >> >> On Thu, Aug 1, 20

Re: [ANNOUNCE] Apache PyIceberg release 0.7.0

2024-07-31 Thread Yufei Gu
Awesome. Thanks Sung and every contributor! Yufei On Wed, Jul 31, 2024 at 12:44 AM Honah J. wrote: > Thanks Sung for running the release and thanks everyone for contributing! > This is a great milestone for PyIceberg! > > Best regards, > Honah > > On Tue, Jul 30, 2024 at 10:47 PM Renjie Liu >

Re: [VOTE] Drop Java 8 support in Iceberg 1.7.0

2024-07-26 Thread Yufei Gu
+1 (binding) Yufei On Fri, Jul 26, 2024 at 10:05 AM Dmitri Bourlatchkov wrote: > I mean +1 to _drop_ java 8 in 1.7.0 :) > > On Fri, Jul 26, 2024 at 1:04 PM Dmitri Bourlatchkov < > dmitri.bourlatch...@dremio.com> wrote: > >> +1 (nb) to Java 8 support in Iceberg 1.7.0. >> >> Cheers, >> Dmitri. >>

Re: Administration of Apache Iceberg Social/Marketing Channels

2024-07-24 Thread Yufei Gu
+1 for option 3, to publish meetup videos under the Apache Iceberg YouTube channel in a new playlist. We normally trust people to do the right thing. That's how the community thrives. But in case any video violates ASF trademark guidelines, we can still take it down as Jack suggested. Yufei On

Re: Meeting time for catalog community sync

2024-07-24 Thread Yufei Gu
+1 on the proposal of 2 catalog syncs followed by 1 main community sync on Wednesday. Yufei On Wed, Jul 24, 2024 at 1:00 PM Jack Ye wrote: > Hi everyone, > > First of all, thanks everyone that has participated in the sync so far. > > As one of the action items for the catalog community sync me

Re: [ANNOUNCE] Welcoming new committers and PMC members

2024-07-23 Thread Yufei Gu
Congratulations! Thanks a lot for the contribution! Yufei On Tue, Jul 23, 2024 at 9:01 AM Walaa Eldin Moustafa wrote: > Congratulations everyone! Great to see the community growing. > > Thanks, > Walaa. > > On Tue, Jul 23, 2024 at 8:51 AM Alex Dutra > wrote: > >> Congratulations to you all! >

Re: Dropping JDK 8 support

2024-07-22 Thread Yufei Gu
will it go straight to 2.0? >>> >>> On Mon, Jul 22, 2024 at 5:32 PM Manu Zhang >>> wrote: >>> >>>> If JDK 8 support is dropped in 2.0, will we continue to fix critical >>>> issues in 1.6+? >>>> >>>> On Tue, Jul 23, 2024

Re: Dropping JDK 8 support

2024-07-22 Thread Yufei Gu
+1 (binding). As much as I want to drop JDK 8, I still encourage everyone to speak out about any concerns. Yufei On Mon, Jul 22, 2024 at 10:24 AM Steven Wu wrote: > +1 (binding) > > On Mon, Jul 22, 2024 at 6:37 AM Piotr Findeisen > wrote: > >> Hi, >> >> in the "Building with JDK 21" email thread

Re: [Early Feedback] Variant and Subcolumnarization Support

2024-07-19 Thread Yufei Gu
Agreed with point 1. For point 2, I also prefer to hold the spec and reference implementation under Iceberg. Here are the reasons: 1. It is unconventional and impractical for one engine to depend on another for data types. For instance, it is not ideal for Trino to rely on data types defined by th

Re: [DISCUSS] DROP PARTITION in Spark

2024-07-17 Thread Yufei Gu
Based on my observations, users don't appear to be missing this feature, but I'm OK to add it in Spark for compatibility purposes. Yufei On Wed, Jul 17, 2024 at 11:14 AM Szehon Ho wrote: > Hi Gabor > > I'm neutral for this, but can be convinced. My initial thoughts is that > there would be no

Re: [VOTE] spec: remove the JSON spec for content file and file scan task sections

2024-07-11 Thread Yufei Gu
+1 (binding) Thanks for doing this, Steven. Yufei On Thu, Jul 11, 2024 at 10:16 AM Amogh Jahagirdar <2am...@gmail.com> wrote: > + 1 (non-binding). > > Thanks, > > Amogh Jahagirdar > > On Thu, Jul 11, 2024 at 10:25 AM Péter Váry > wrote: > >> +1 (non-binding) >> >> On Thu, Jul 11, 2024, 17:31 Ja

Re: [DISCUSS] Enable the discussion tab for iceberg github repos

2024-07-11 Thread Yufei Gu
+1. It is a no-brainer to me given it is more search-engine friendly compared to Slack and email. It also won't have Slack's retention issue. I'd like to see it on other Iceberg repos as well. Yufei On Wed, Jul 10, 2024 at 3:01 PM Jack Ye wrote: > +1 for enabling it on iceberg-rust f

Re: [DISCUSS] Extend Snapshot Metadata Lifecycle

2024-07-09 Thread Yufei Gu
Thank you for the interesting proposal. With a minor specification change, it could indeed enable different retention periods for data files and metadata files. This differentiation is useful for two reasons: 1. More metadata helps us better understand the table history, providing valuable i

Re: [Vote] Deprecate oauth tokens endpoint

2024-07-09 Thread Yufei Gu
+1 Posted my points in the previous email thread. Yufei On Tue, Jul 9, 2024 at 6:15 AM Dmitri Bourlatchkov wrote: > +1 (non-binding) > > I previously posted comments in the GH PR (already addressed). > > Cheers, > Dmitri. > > On Mon, Jul 8, 2024 at 12:15 PM Robert Stupp wrote: > >> Hi Everyo

Spark: Copy Table Action

2024-07-08 Thread Yufei Gu
Hi folks, I'd like to share recent progress on adding actions to copy tables across different places. There is a constant need to copy tables across different places for purposes such as disaster recovery and testing. Due to the absolute file paths in Iceberg metadata, it doesn't work automatic

Re: [Proposal] REST Spec: Server-side Metadata Tables

2024-07-05 Thread Yufei Gu
think the proposal is very interesting! The direction this and other > proposals are going is IMO the right one. > > Since many proposals need access to at least manifest-lists and manifest > files, potentially also data/delete files, does it make sense to bundle all > proposals that n

[Proposal] REST Spec: Server-side Metadata Tables

2024-07-03 Thread Yufei Gu
Hi folks, I'd like to discuss a new proposal to support server-side metadata tables. One of Iceberg's most advantageous features is the ability to inspect a table using metadata tables. For instance, we can query snapshots just like we query data rows using the following command: SELECT * FROM pr

Re: Addressing security questions in the Iceberg REST specification

2024-05-29 Thread Yufei Gu
ng "/v1/oauth/tokens" and I think I also >>>>>> disagree with the premise that implementing that endpoint is required, >>>>>> but >>>>>> I can understand how that's not clear in the spec. I think we can address >>>>&g

Re: Addressing security questions in the Iceberg REST specification

2024-05-28 Thread Yufei Gu
Not an expert on authentication, but reading from the context, I agree that it’s not a good practice to use a resource server as a token server. The resource server would need to securely handle and store credentials or tokens, increasing the risk of credential theft or leakage. Making the token en

Re: GitHub issue labels

2024-05-28 Thread Yufei Gu
It’s a good idea to send a weekly report. It increases visibility, engages the community, and helps track progress. Key considerations include keeping the report concise, and automating the process. We could categorize issues/PRs with labels. For example, putting the ones triaged and un-triaged i

Re: [DISCUSS] camel-iceberg component

2024-05-21 Thread Yufei Gu
Hi JB, Thanks for sharing. Got a few questions: 1. Does Apache Camel rely on other engines, e.g., Spark or Flink for any processing, or is it fully self-contained? 2. What are the potential challenges or limitations you foresee? For example, does it generate too many commits and/or sm

Re: [Early Feedback] Variant and Subcolumnarization Support

2024-05-11 Thread Yufei Gu
Sounds like a good idea. Looking forward to a proposal. Yufei On Sat, May 11, 2024 at 9:27 AM Amogh Jahagirdar wrote: > Hi all, > > Thanks for raising this thread! Overall I agree that some forms of variant > types would be useful in Iceberg and we've seen interest in support for > that as well

Re: How to Set S3 Credentials at bucket level in Iceberg Spark Session

2024-04-22 Thread Yufei Gu
Hi Awasthi, How about configuring two catalogs in Spark? One points to the source data, and another points to the target. You can configure different credentials in that case. Yufei On Mon, Apr 22, 2024 at 8:49 AM Awasthi, Somesh wrote: > Hi Jack/Dev Team, > > > > We want to pass separate cr
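The two-catalog suggestion above can be sketched as a set of Spark properties, one group per catalog. This is a minimal sketch, not a tested deployment: the catalog names `source_cat` and `target_cat`, the helper `two_catalog_conf`, and all credential values are hypothetical, and it assumes Iceberg's `S3FileIO` with its `s3.access-key-id` and `s3.secret-access-key` catalog properties. The catalog type (or `uri`) is deliberately left out and would need to be added for a real session.

```python
def two_catalog_conf(source, target):
    """Build Spark properties for two Iceberg catalogs with separate S3 credentials.

    `source` and `target` are dicts with the hypothetical keys
    'warehouse', 'access_key_id', and 'secret_access_key'.
    """
    conf = {}
    for name, creds in (("source_cat", source), ("target_cat", target)):
        prefix = f"spark.sql.catalog.{name}"
        # Register the catalog; the catalog type/uri is intentionally omitted here.
        conf[prefix] = "org.apache.iceberg.spark.SparkCatalog"
        conf[f"{prefix}.warehouse"] = creds["warehouse"]
        # Assumption: S3FileIO is used, so credentials are scoped per catalog.
        conf[f"{prefix}.io-impl"] = "org.apache.iceberg.aws.s3.S3FileIO"
        conf[f"{prefix}.s3.access-key-id"] = creds["access_key_id"]
        conf[f"{prefix}.s3.secret-access-key"] = creds["secret_access_key"]
    return conf


# Placeholder values for illustration only.
conf = two_catalog_conf(
    {"warehouse": "s3://source-bucket/wh", "access_key_id": "SRC_KEY", "secret_access_key": "SRC_SECRET"},
    {"warehouse": "s3://target-bucket/wh", "access_key_id": "TGT_KEY", "secret_access_key": "TGT_SECRET"},
)
```

Each entry could then be passed to a Spark session builder via `config(key, value)`, after which tables are addressed as `source_cat.db.tbl` and `target_cat.db.tbl`.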

Re: spec question on equality deletes

2024-04-16 Thread Yufei Gu
unds like on 2, your thinking is that (b) is the correct behavior. >> Indeed, I have tried it out with Spark and afaict, it does (b). However, >> that does not mean that it is the correct behavior. The spec should clearly >> define it. >> - Wing Yew >> >> >>

Re: spec question on equality deletes

2024-04-15 Thread Yufei Gu
Hi Wing Yew Poon, Here is my understanding, but not necessarily how an engine implements it. It should only consider the columns in equality_ids when we apply eq deletes. Also the engine should ignore the unrelated columns. It will still delete the row with id 3 in the following case you described
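The interpretation above, that only the columns listed in equality_ids are compared and all other columns are ignored, can be illustrated with a small sketch. This is plain Python, not Iceberg code: the function name, the use of column names in place of field IDs, and the sample rows are all hypothetical, and sequence-number ordering between data files and delete files is not modeled.

```python
def apply_equality_deletes(rows, delete_rows, equality_columns):
    """Return the rows that are not matched by any equality delete.

    Only `equality_columns` are compared; any other column present
    in a delete row is ignored.
    """
    delete_keys = {tuple(d[c] for c in equality_columns) for d in delete_rows}
    return [
        r for r in rows
        if tuple(r[c] for c in equality_columns) not in delete_keys
    ]


# Illustrative data: the delete row carries an unrelated 'data' value
# that does not match the data row, yet the row with id 3 is still deleted.
rows = [{"id": 1, "data": "a"}, {"id": 3, "data": "b"}]
deletes = [{"id": 3, "data": "something-else"}]
remaining = apply_equality_deletes(rows, deletes, ["id"])
```

With `["id", "data"]` as the equality columns instead, the same delete row would match nothing and both rows would remain.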

Re: Meeting Minutes 2024-03-27

2024-03-27 Thread Yufei Gu
Nice summary, Renjie! Great to see the community grow, and welcome to the new contributors! Yufei On Wed, Mar 27, 2024 at 7:34 PM Renjie Liu wrote: > Sorry, I missed Himadri Pal for rest catalog: > > 3. Some enhancement of rest catalog, such as oauth, custom headers. Thanks > Himadri Pal(himadripal),
