Re: [DISCUSS] Use pr title + pr description as default git commit title + message in iceberg-rust

2025-01-13 Thread Xuanwo
messages. > > Looking forward to hearing from you! Xuanwo https://xuanwo.io/

Re: [ANNOUNCE] Release Apache Iceberg Rust v0.4.0

2024-12-23 Thread Xuanwo
com/apache/iceberg-rust/issues > - Mailing list: dev@iceberg.apache.org > > Thank you > On behalf of Apache Iceberg Community -- Xuanwo https://xuanwo.io/

Re: [VOTE] Release Apache Iceberg Rust 0.4.0 RC3

2024-12-23 Thread Xuanwo
Checksums and signatures. >> [ ] LICENSE/NOTICE files exist >> [ ] No unexpected binary files >> [ ] All source files have ASF headers >> [ ] Can compile from source >> >> More detailed checklist please refer to: >> https://github.com/apache/iceberg-rust/tree/main/scripts >> >> To compile from source, please refer to: >> https://github.com/apache/iceberg-rust/blob/main/CONTRIBUTING.md >> >> Here is a Python script in release to help you verify the release candidate: >> >> ./scripts/verify.py >> >> Thank you again for your time in helping verify the release. >> >> Sung Xuanwo https://xuanwo.io/
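
The "checksums" item in the checklist above can be illustrated with a minimal sketch. It is not the project's `./scripts/verify.py`; the function names are hypothetical, and the sketch assumes the `.sha512` file begins with the hex digest, optionally followed by a file name.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact: Path, checksum_file: Path) -> bool:
    """Compare an artifact against its checksum file.

    Assumption: the first whitespace-separated token of the checksum
    file is the expected hex digest.
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

Signature verification is a separate step and is not covered by this sketch.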

Re: [discuss] Allow 200 responses for HEAD requests in REST API

2024-12-18 Thread Xuanwo
890> >>>> and namespace_exists >>>> <https://github.com/apache/iceberg-python/pull/1434/files#diff-3bda7391ebd8aa3dcfd6703d8d2764830b9d9c35fa854188a37d69611274bd3dR882>. >>>> The motivation for this change is to enable more intuitive and >>>> user-friendly integrations with catalogs, as Fokko highlighted here >>>> <https://github.com/apache/iceberg-python/issues/1363#issuecomment-2497462825>. >>>> Standardizing this behavior in the Catalog REST spec would promote >>>> consistency across implementations and make interactions easier for users >>>> and client developers. >>>> Would love to hear your thoughts on this proposal! >>>> Best, >>>> Kevin Liu Xuanwo https://xuanwo.io/

Re: [VOTE] Release Apache Iceberg Rust 0.4.0 RC2

2024-12-17 Thread Xuanwo
://github.com/apache/iceberg-rust/tree/main/scripts > > To compile from source, please refer to: > https://github.com/apache/iceberg-rust/blob/main/CONTRIBUTING.md > > Here is a Python script in release to help you verify the release candidate: > > ./scripts/verify.py > > Thank you! -- Xuanwo https://xuanwo.io/

Re: New committer: Scott Donnelly

2024-12-10 Thread Xuanwo
>>> The Project Management Committee (PMC) for Apache Iceberg has invited Scott >>> Donnelly to become a committer. Scott did a lot of impressive work in >>> iceberg-rust, and we are pleased to announce that he has accepted. >>> >>> Please join us in

Re: Storing catalog directly on object store

2024-12-03 Thread Xuanwo
Tue, Nov 26, 2024 at 6:35 PM Nikhil Benesch >>>> > >> wrote: >>>> > >> > >>>> > >> > Hi all, >>>> > >> > >>>> > >> > With Amazon S3 announcing support for the If-Match header yesterday >>>> > >> > [0], all the >>>> > >> > major object store implementations now support a compare-and-swap >>>> > >> > operation. >>>> > >> > >>>> > >> > As far as I can tell, this opens up the possibility of storing >>>> > >> > Iceberg >>>> > >> > catalogs directly on object storage, without the need for a >>>> > >> > separate metastore, >>>> > >> > and without violating any of Iceberg's ACID guarantees. >>>> > >> > >>>> > >> > It seems the immediate next step is to build an independent Java or >>>> > >> > REST catalog >>>> > >> > backend to prove this concept out. Long term, though, the ideal >>>> > >> > would be to >>>> > >> > have such a catalog backend be a first class citizen in the Iceberg >>>> > >> > project. >>>> > >> > >>>> > >> > Is anyone else in the Iceberg community barking up this tree? I'm a >>>> > >> > long term >>>> > >> > Iceberg enthusiast, but new to the community. I'd very much >>>> > >> > appreciate any >>>> > >> > pointers to current or past discussions on the topic. So far all >>>> > >> > I've been >>>> > >> > able to turn up is some light chatter from myself and others on >>>> > >> > Bluesky and >>>> > >> > Hacker News ([1][2][3]). >>>> > >> > >>>> > >> > Cheers, >>>> > >> > Nikhil >>>> > >> > >>>> > >> > [0]: >>>> > >> > https://aws.amazon.com/about-aws/whats-new/2024/11/amazon-s3-functionality-conditional-writes/ >>>> > >> > [1]: https://bsky.app/profile/benesch.bsky.social/post/3lauesxg3ic2c >>>> > >> > [2]: >>>> > >> > https://bsky.app/profile/eatonphil.bsky.social/post/3lbskq3jwk22e >>>> > >> > [3]: https://news.ycombinator.com/item?id=42240370 Xuanwo https://xuanwo.io/
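
The compare-and-swap idea in the quoted message can be sketched as follows. This is a minimal illustration, not an Iceberg catalog implementation: `InMemoryObjectStore`, `put_if_match`, and `commit_table_pointer` are hypothetical names, and the in-memory store only stands in for an object store that honors an If-Match style precondition.

```python
import hashlib
from typing import Dict, Optional, Tuple


class PreconditionFailed(Exception):
    """Raised when the stored ETag does not match the caller's expectation."""


class InMemoryObjectStore:
    """Hypothetical stand-in for an object store with conditional writes."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    @staticmethod
    def _etag(data: bytes) -> str:
        # Assumption: the ETag is a content hash; real stores may differ.
        return hashlib.sha256(data).hexdigest()

    def get(self, key: str) -> Optional[Tuple[bytes, str]]:
        """Return (data, etag) for the key, or None if it does not exist."""
        data = self._objects.get(key)
        if data is None:
            return None
        return data, self._etag(data)

    def put_if_match(self, key: str, data: bytes, expected_etag: Optional[str]) -> str:
        """Write only if the current ETag equals expected_etag.

        expected_etag=None means "the object must not exist yet".
        """
        current = self.get(key)
        current_etag = current[1] if current is not None else None
        if current_etag != expected_etag:
            raise PreconditionFailed(key)
        self._objects[key] = data
        return self._etag(data)


def commit_table_pointer(
    store: InMemoryObjectStore,
    key: str,
    expected_etag: Optional[str],
    new_metadata_location: str,
) -> bool:
    """Try to swap the table pointer; return False if another writer won."""
    try:
        store.put_if_match(key, new_metadata_location.encode("utf-8"), expected_etag)
        return True
    except PreconditionFailed:
        return False
```

A writer that loses the race would re-read the pointer, rebase its changes, and retry with the fresh ETag.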

Re: [DISCUSS] iceberg rust 0.4.0 and iceberg pyiceberg_core 0.1.0 release

2024-11-29 Thread Xuanwo
Hi Thank you so much for considering my thoughts. > Shall we sync up through the open tracking issue / slack on those steps? That would be nice! On Fri, Nov 29, 2024, at 03:32, Sung Yun wrote: > Hi Xuanwo, > > Thank you for sharing your thoughts! I think that makes a stron

Re: [DISCUSS] iceberg rust 0.4.0 and iceberg pyiceberg_core 0.1.0 release

2024-11-28 Thread Xuanwo
>>> Sung >>> >>> [1] https://github.com/apache/iceberg-rust/pull/705 >>> [2] https://lists.apache.org/thread/j22o7yktrlddrgkcy7gl88o23nyrgooc >>> >>> On 2024/09/05 14:06:10 xianjin wrote: >>> > +1 for this pyiceberg_core

Re: Storing catalog directly on object store

2024-11-27 Thread Xuanwo
> Iceberg enthusiast, but new to the community. I'd very much appreciate any >> > pointers to current or past discussions on the topic. So far all I've been >> > able to turn up is some light chatter from myself and others on Bluesky and >> > Hacker News ([1][2][3]). >> > >> > Cheers, >> > Nikhil >> > >> > [0]: >> > https://aws.amazon.com/about-aws/whats-new/2024/11/amazon-s3-functionality-conditional-writes/ >> > [1]: https://bsky.app/profile/benesch.bsky.social/post/3lauesxg3ic2c >> > [2]: https://bsky.app/profile/eatonphil.bsky.social/post/3lbskq3jwk22e >> > [3]: https://news.ycombinator.com/item?id=42240370 -- Xuanwo https://xuanwo.io/

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-11-21 Thread Xuanwo
. >> >> [ ] +1 Release this as Apache Iceberg 1.7.1 >> [ ] +0 >> [ ] -1 Do not release this because... >> >> Only PMC members have binding votes, but other community members are >> encouraged to cast >> non-binding votes. This vote will pass if there are 3 binding +1 votes and >> more binding >> +1 votes than -1 votes. >> >> (NOTE: The vote on 1.7.1 RC0 was skipped as a last minute bug fix came in.) Xuanwo https://xuanwo.io/
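
The passing rule quoted above can be written as a small predicate. This sketch reads "3 binding +1 votes" as "at least 3", which is an assumption; `vote_passes` is a hypothetical helper, and non-binding votes are not counted.

```python
def vote_passes(binding_plus_one: int, binding_minus_one: int) -> bool:
    """Apply the quoted rule: enough binding +1 votes, and more +1 than -1."""
    # Assumption: "3 binding +1 votes" means at least three.
    return binding_plus_one >= 3 and binding_plus_one > binding_minus_one
```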

Re: Fwd: Notification: Iceberg Community Sync (Recorded) @ Thu Nov 14, 2024 1am - 2am (GMT+8) (Manu Zhang)

2024-11-14 Thread Xuanwo
Here is my +1 non-binding. On Fri, Nov 15, 2024, at 12:27, Xuanwo wrote: > Thank you, that will be greatly appreciated. > > On Fri, Nov 15, 2024, at 10:42, Manu Zhang wrote: >> I find the Community Sync time is one hour later for CST (and probably other >> timezones) after

Re: Fwd: Notification: Iceberg Community Sync (Recorded) @ Thu Nov 14, 2024 1am - 2am (GMT+8) (Manu Zhang)

2024-11-14 Thread Xuanwo
ndar <https://calendar.google.com/calendar/> > > You are receiving this email because you are subscribed to calendar > notifications. To stop receiving these emails, go to Calendar settings > <https://calendar.google.com/calendar/r/settings>, select this calendar, and > change

Re: [DISCUSS] Duplicate KEYS files

2024-11-12 Thread Xuanwo
>>>> On Mon, Nov 11, 2024 at 4:13 PM Fokko Driesprong wrote: >>>>> > >>>>> > Hi everyone, >>>>> > >>>>> > While looking at the release steps for iceberg-go, I noticed that we >>>>> > have two KEYS files: >>>>> > >>>>> > https://dist.apache.org/repos/dist/dev/iceberg/KEYS >>>>> > https://dist.apache.org/repos/dist/release/iceberg/KEYS (Also available >>>>> > through https://downloads.apache.org/iceberg/KEYS) >>>>> > >>>>> > The first one is referenced by Java and Python, and the last one by >>>>> > Rust. As mentioned earlier, Go references them both. Should we >>>>> > consolidate these? My suggestion would be to merge the `/dev/` ones >>>>> > into the `release` ones, and get rid of the one in `dev`. Thoughts? >>>>> > >>>>> > Kind regards, >>>>> > Fokko Xuanwo https://xuanwo.io/

Re: [DISCUSS] Duplicate KEYS files

2024-11-11 Thread Xuanwo
>> On Mon, Nov 11, 2024 at 9:45 AM Matt Topol wrote: >>>> +1 (non-binding) for merging, I can update the docs on the iceberg Go >>>> release README after it's done! >>>> >>>> >>>> On Mon, Nov 11, 2024, 12:20 PM Yufei Gu w

Re: [DISCUSS] Duplicate KEYS files

2024-11-11 Thread Xuanwo
and >> the last one by Rust <https://rust.iceberg.apache.org/release.html>. As >> mentioned earlier, Go references them both. Should we consolidate these? My >> suggestion would be to merge the `/dev/` ones into the `release` ones, and >> get rid of the one in `dev`. Thoughts? >> >> Kind regards, >> Fokko Xuanwo https://xuanwo.io/

Re: [VOTE] Iceberg Rust Sync Meeting Time

2024-11-08 Thread Xuanwo
6 Oct 2024 at 07:42, NOTME ZE wrote: >>>>> Hi, I prefer from 23:00 to 00:00 GMT+8, last Thursday of each month. But >>>>> both times work for me. >>>>> >>>>> On Fri, Oct 25, 2024 at 19:28, Christian Thiel wrote: >>>>>> Hi Renji, thanks for pi

Re: [VOTE] Iceberg Rust Sync Meeting Time

2024-10-25 Thread Xuanwo
<https://lists.apache.org/thread/yxfmg94g1kg6r77rp5xz8j1xndvw86s5>, we want > to start a vote for Iceberg Rust Sync Meeting Time, and here are the options > gathered: > > 1. From Xuanwo: One week before Iceberg Sync Meeting, from 00:00 to 01:00 > GMT+8 > 2. From Renjie +

Re: [DISCUSS] Iceberg Rust Sync Meeting

2024-10-23 Thread Xuanwo
PM Sung Yun wrote: >> Thank you for starting this thread Xuanwo, I'm +1 for an Iceberg Rust meeting. >> >> Regarding the meeting time, I believe the Iceberg Catalog Community sync >> happens two consecutive weeks, at the same time as the Iceberg community >> sync,

Re: [DISCUSS] Iceberg Rust Sync Meeting

2024-10-11 Thread Xuanwo
and can choose a better time according to the Iceberg Rust > developers? (Perhaps we can have a poll) Hi, xxchan. I propose a time close to the Iceberg Sync Meeting to ensure most community members can join. I'm open to other options. Would you like to suggest one? On Fri, Oct 11, 2024

[DISCUSS] Iceberg Rust Sync Meeting

2024-10-09 Thread Xuanwo
but I will take notes in a Google Doc, similar to what we do in the Iceberg Sync Meeting. What are your thoughts? I'm open to other options as well. Xuanwo https://xuanwo.io/

Re: Iceberg python library sync

2024-09-20 Thread Xuanwo
/8) at 9 AM (UTC-7, PDT) a good time for everyone >>>>>>>>>>> (there is a community sync on 9/1)? >>>>>>>>>>> >>>>>>>>>>> Please join the iceberg-python-sync >>>>>>>>>>> <https://groups.google.com/search?q=iceberg-python-sync> list on >>>>>>>>>>> Google Groups to receive an invitation. >>>>>>>>>>> >>>>>>>>>>> Thanks. >>>>>>>>>>> >>>>>>>>>>> Jun >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Mon, Aug 16, 2021 at 8:04 AM Jun H. wrote: >>>>>>>>>>>> Hi everyone, >>>>>>>>>>>> >>>>>>>>>>>> I have sent the meeting invite using the replied emails in the >>>>>>>>>>>> threads. Here is the meeting agenda: >>>>>>>>>>>> https://docs.google.com/document/d/1oMKodaZJrOJjPfc8PDVAoTdl02eGQKHlhwuggiw7s9U/edit?usp=sharing. >>>>>>>>>>>> >>>>>>>>>>>> Similar to iceberg community sync, please join the >>>>>>>>>>>> iceberg-python-sync >>>>>>>>>>>> <https://groups.google.com/search?q=iceberg-python-sync> list on >>>>>>>>>>>> Google Groups to receive an invitation. >>>>>>>>>>>> >>>>>>>>>>>> Thanks. >>>>>>>>>>>> >>>>>>>>>>>> Jun >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Sat, Aug 14, 2021 at 6:18 AM Uwe L. Korn >>>>>>>>>>>> wrote: >>>>>>>>>>>>> Please also invite me as well. I currently don’t have the time to >>>>>>>>>>>>> join but would be interested in joining in future. >>>>>>>>>>>>> >>>>>>>>>>>>>> On 13.08.2021 at 23:36, Ryan Blue wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, Jun! >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Aug 13, 2021 at 2:29 PM Jun H. >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> Thanks everyone. I will set up the sync meeting to kick off the >>>>>>>>>>>>>>> discussion at 9 AM (UTC-7, PDT) on 08/18/2021 (coming >>>>>>>>>>>>>>> Wednesday). I will create and share a meeting agenda and notes >>>>>>>>>>>>>>> doc soon.
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Best regards, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Jun >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Thu, Aug 12, 2021 at 1:49 PM Szehon Ho >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> +1, would love to listen in as well >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Szehon >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 12 Aug 2021, at 12:48, Arthur Wiedmer >>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi Jun, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Please add me as well! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>>> Arthur >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Thu, Aug 12, 2021 at 12:19 AM Jun H. >>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>> Hi everyone, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Since early this year, we have started working on the >>>>>>>>>>>>>>>>>> iceberg python library to bring it up to date and support >>>>>>>>>>>>>>>>>> the new V2 spec. Here is a summary of the current feature >>>>>>>>>>>>>>>>>> plan >>>>>>>>>>>>>>>>>> <https://docs.google.com/document/d/1Plt78Gbm22yoybKOShNFNaJyLM7xJAH20nu25PumeMI/edit?usp=sharing>. >>>>>>>>>>>>>>>>>> We have a lot of interesting work to do. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> To keep the community in sync, we plan to set up a recurring >>>>>>>>>>>>>>>>>> iceberg python library sync meeting. Please let me know if >>>>>>>>>>>>>>>>>> you are interested in or have any questions. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Jun >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Ryan Blue >>>>>>>>>>>>>> Tabular Xuanwo https://xuanwo.io/

Re: [DISCUSS] September board report

2024-09-11 Thread Xuanwo
rs can become committers. This builds on work from last quarter >> that >> clarified the process for design discussions. >> >> Many of the topics under discussion were raised because of the acquisition >> that >> was noted in the last board report. The community has been working to address >> the concerns raised, which are primarily in 3 areas: >> >> How decisions are made about designs and commits (now clarified) >> How contributors become committers and PMC members (under discussion) >> How the community operates when people cannot reach consensus >> >> The last concern has historically not been a problem; people have so far >> chosen to “disagree and commit” when a large majority in the community has >> a different opinion. However, the first instance of this was encountered near >> the end of the quarter. The community and PMC need to discuss how to make >> progress on the issue. -- Xuanwo https://xuanwo.io/

[DISCUSS] iceberg rust 0.4.0 and iceberg pyiceberg_core 0.1.0 release

2024-09-05 Thread Xuanwo
encounter too many breaking changes at once. Additionally, the pyiceberg team is awaiting our first release of pyiceberg_core 0.1.0 so they can integrate with it, see how it works, and explore ways to improve collaboration. What do you think? Xuanwo https://xuanwo.io/

Re: Request to Add RisingWave to Apache Iceberg Documentation

2024-08-29 Thread Xuanwo
rg >>>> <https://docs.risingwave.com/docs/current/ingest-from-iceberg/> >>>> We would greatly appreciate it if you could consider adding RisingWave to >>>> the relevant sections of the Iceberg documentation. If there's any >>>> additional information or collaboration required from our side to make >>>> this happen, please do not hesitate to reach out. >>>> Thank you for your continued support and for fostering such a vibrant and >>>> collaborative community. >>>> >>>> Alice Lyu, GTM @RisingWave >>>> 📧 alice@risingwave.com >>>> 📱 +1 650-772-2096 >>>> 🌐 risingwave.com >>>> 📝 LinkedIn <https://www.linkedin.com/in/alice-ucb/> >>>> >>>> *RisingWave | Real-Time Stream Processing for Modern Data Stack* >>>> Xuanwo https://xuanwo.io/

Re: [DISCUSS] iceberg-rust: pyiceberg_core 0.1.0 Release

2024-08-28 Thread Xuanwo
> Decoupling pyiceberg_core with iceberg-rust may be flexible, but my concern > is that this may not be scalable when we introduce more language bindings. I > think Xuanwo has a lot of experience when maintaining Apache OpenDAL > <https://github.com/apache/opendal> ? OpenDA

Re: [DISCUSS] iceberg-rust: pyiceberg_core 0.1.0 Release

2024-08-27 Thread Xuanwo
c0nkc3k6646lvro1lv22pvhwlp50ss > FileIO: https://lists.apache.org/thread/86zotqs1wqxojt4zx8np29q5doj1l1wc > > PR for exposing transforms as Python bindings: > https://github.com/apache/iceberg-rust/pull/556 Xuanwo https://xuanwo.io/

Re: [DISCUSS] Variant Spec Location

2024-08-22 Thread Xuanwo
ut I don't feel like that should be the initial >> step. >> > > > >>>>>>>>> >> > > > >>>>>>>>> No one is excited about the possibility that the physical >> > > > representations end up diverging, but it feels like we're setting >> > > ourselves >> > > > up for that exact scenario. >> > > > >>>>>>>>> >> > > > >>>>>>>>> -Dan >> > > > >>>>>>>>> >> > > > >>>>>>>>> >> > > > >>>>>>>>> On Wed, Aug 14, 2024 at 6:54 AM Fokko Driesprong < >> > > > fo...@apache.org> wrote: >> > > > >>>>>>>>>> >> > > > >>>>>>>>>> +1 to what's already being said here. It is good to copy >> the >> > > > spec to Iceberg and add context that's specific to Iceberg, but at >> the >> > > same >> > > > time, we should maintain compatibility. >> > > > >>>>>>>>>> >> > > > >>>>>>>>>> Kind regards, >> > > > >>>>>>>>>> Fokko >> > > > >>>>>>>>>> >> > > > >>>>>>>>>> Op wo 14 aug 2024 om 15:30 schreef Manu Zhang < >> > > > owenzhang1...@gmail.com>: >> > > > >>>>>>>>>>> >> > > > >>>>>>>>>>> +1 to copy the spec into our repository. I think the best >> > way >> > > > to keep compatibility is building integration tests. >> > > > >>>>>>>>>>> >> > > > >>>>>>>>>>> Thanks, >> > > > >>>>>>>>>>> Manu >> > > > >>>>>>>>>>> >> > > > >>>>>>>>>>> On Wed, Aug 14, 2024 at 8:27 PM Péter Váry < >> > > > peter.vary.apa...@gmail.com> wrote: >> > > > >>>>>>>>>>>> >> > > > >>>>>>>>>>>> Thanks Russell and Aihua for pushing Variant support! >> > > > >>>>>>>>>>>> >> > > > >>>>>>>>>>>> Given the differences between the supported types and >> the >> > > > lack of interest from the other project, I think it is reasonable to >> > > > duplicate the specification to our repository. >> > > > >>>>>>>>>>>> I would give very strong emphasis on sticking to the >> Spark >> > > > spec as much as possible, to keep compatibility as much as possible. >> > > Maybe >> > > > even revert to a shared specification if the situation changes. 
>> > > > >>>>>>>>>>>> >> > > > >>>>>>>>>>>> Thanks, >> > > > >>>>>>>>>>>> Peter >> > > > >>>>>>>>>>>> >> > > > >>>>>>>>>>>> Aihua Xu ezt írta (időpont: 2024. >> > aug. >> > > > 13., K, 19:52): >> > > > >>>>>>>>>>>>> >> > > > >>>>>>>>>>>>> Thanks Russell for bringing this up. >> > > > >>>>>>>>>>>>> >> > > > >>>>>>>>>>>>> This is the main blocker to move forward with the >> Variant >> > > > support in Iceberg and hopefully we can have a consensus. To me, I >> also >> > > > feel it makes more sense to move the spec into Iceberg rather than >> > Spark >> > > > engine owns it and we try to keep it compatible with Spark spec. >> > > > >>>>>>>>>>>>> >> > > > >>>>>>>>>>>>> Thanks, >> > > > >>>>>>>>>>>>> Aihua >> > > > >>>>>>>>>>>>> >> > > > >>>>>>>>>>>>> On Mon, Aug 12, 2024 at 6:50 PM Russell Spitzer < >> > > > russell.spit...@gmail.com> wrote: >> > > > >>>>>>>>>>>>>> >> > > > >>>>>>>>>>>>>> Hi Y’all, >> > > > >>>>>>>>>>>>>> >> > > > >>>>>>>>>>>>>> We’ve hit a bit of a roadblock with the Variant >> > Proposal, >> > > > while we were hoping to move the Variant and Shredding specifications >> > > from >> > > > Spark into Iceberg there doesn’t seem to be a lot of interest in >> that. >> > > > Unfortunately, I think we have a number of issues with just linking >> to >> > > the >> > > > Spark project directly from within Iceberg and I believe we need to >> > copy >> > > > the specifications into our repository. >> > > > >>>>>>>>>>>>>> >> > > > >>>>>>>>>>>>>> There are a few reasons why i think this is necessary >> > > > >>>>>>>>>>>>>> >> > > > >>>>>>>>>>>>>> First, we have a divergence of types already. The >> Spark >> > > > Specification already includes types which Iceberg has no definition >> > for >> > > > (19, 20 - Interval Types) and Iceberg already has a type which is not >> > > > included within the Spark Specification (Time) and will soon have >> more >> > > with >> > > > TimestampNS, and Geo. 
>> > > > >>>>>>>>>>>>>> >> > > > >>>>>>>>>>>>>> Second, We would like to make sure that Spark is not a >> > > hard >> > > > dependency for other engines. We are working with several >> implementers >> > of >> > > > the Iceberg spec and it has previously been agreed that it would be >> > best >> > > if >> > > > the source of truth for Variant existed in an engine and file format >> > > > neutral location. The Iceberg project has a good open model of >> > governance >> > > > and, as we have seen so far discussing Variant, open and active >> > > > collaboration. This would also help as we can strictly version our >> > > changes >> > > > in-line with the rest of the Iceberg spec. >> > > > >>>>>>>>>>>>>> >> > > > >>>>>>>>>>>>>> Third, The Shredding spec is not quite finished and >> > > > requires some group analysis and discussion before we commit it. I >> > think >> > > > again the Iceberg community is probably the right place for this to >> > > happen >> > > > as we have already started discussions here on these topics. >> > > > >>>>>>>>>>>>>> >> > > > >>>>>>>>>>>>>> For these reasons I think we should go with a direct >> > copy >> > > > of the existing specification from the Spark Project and move ahead >> > with >> > > > our discussions and modifications within Iceberg. That said, I do not >> > > want >> > > > to diverge if possible from the Spark proposal. For example, although >> > we >> > > do >> > > > not use the Interval types above, I think we should not reuse those >> > type >> > > > ids within our spec. Iceberg's Variant Spec types 19 and 20 would >> > remain >> > > > unused along with any other types we think are not applicable. We >> > should >> > > > strive whenever possible to allow for compatibility. >> > > > >>>>>>>>>>>>>> >> > > > >>>>>>>>>>>>>> In the interest of moving forward with this proposal I >> > am >> > > > hoping to see if anyone in the community objects to this plan going >> > > forward >> > > > or has a better alternative. 
>> > > > >>>>>>>>>>>>>> >> > > > >>>>>>>>>>>>>> As always I am thankful for your time and am eager to >> > hear >> > > > back from everyone, >> > > > >>>>>>>>>>>>>> Russ >> > > > >>>>>>>>>>>>>> >> > > > >>>>>>>>>>>>>> >> > > > >> > > >> > >> -- Xuanwo https://xuanwo.io/

Re: [ANNOUNCE] Release Apache Iceberg Rust v0.3.0

2024-08-21 Thread Xuanwo
Thank you for pointing that out. I will update the announcement template. On Wed, Aug 21, 2024, at 15:32, Maxim Solodovnik wrote: > Hello, > > I believe there is a typo in announce: > > On Wed, 21 Aug 2024 at 14:30, Xuanwo wrote: >> >> Hi all, >> >> The A

[ANNOUNCE] Release Apache Iceberg Rust v0.3.0

2024-08-20 Thread Xuanwo
-rust/releases/tag/v0.3.0 Apache Iceberg Rust website: https://rust.iceberg.apache.org/ Download Links: https://rust.iceberg.apache.org/download Iceberg Resources: - Issue: https://github.com/apache/iceberg-rust/issues - Mailing list: dev@iceberg.apache.org Thanks Xuanwo On behalf of Apache

[RESULT][VOTE] Release Apache Iceberg Rust 0.3.0 RC1

2024-08-19 Thread Xuanwo
Thiel Vote thread: https://lists.apache.org/thread/qlx91sjj84t5p2jsc17mk0bh917lrzzf Thanks Xuanwo https://xuanwo.io/

Re: [VOTE] Release Apache Iceberg Rust 0.3.0 RC1

2024-08-18 Thread Xuanwo
d imagine most users want to >> run tests by default as part of release verification? Do let me know if I'm >> missing something. >> >> Thanks, >> Amogh Jahagirdar >> >> On Sat, Aug 17, 2024 at 7:24 AM Renjie Liu wrote: >>> Hi: >>>

Re: [DISCUSS] Variant Spec Location

2024-08-15 Thread Xuanwo
>>> This is the main blocker to move forward with the Variant support >>>>>>>>>>>>> in Iceberg and hopefully we can have a consensus. To me, I also >>>>>>>>>>>>> feel it makes more sense to move the spec into Iceberg rather >>>>>>>>>>>>> than Spark engine owns it and we try to keep it compatible with >>>>>>>>>>>>> Spark spec. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Aihua >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Aug 12, 2024 at 6:50 PM Russell Spitzer >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> Hi Y’all, >>>>>>>>>>>>>> >>>>>>>>>>>>>> We’ve hit a bit of a roadblock with the Variant Proposal, while >>>>>>>>>>>>>> we were hoping to move the Variant and Shredding specifications >>>>>>>>>>>>>> from Spark into Iceberg there doesn’t seem to be a lot of >>>>>>>>>>>>>> interest in that. Unfortunately, I think we have a number of >>>>>>>>>>>>>> issues with just linking to the Spark project directly from >>>>>>>>>>>>>> within Iceberg and *I believe we need to copy the specifications >>>>>>>>>>>>>> into our repository*. >>>>>>>>>>>>>> >>>>>>>>>>>>>> There are a few reasons why i think this is necessary >>>>>>>>>>>>>> >>>>>>>>>>>>>> First, we have a divergence of types already. The Spark >>>>>>>>>>>>>> Specification already includes types which Iceberg has no >>>>>>>>>>>>>> definition for (19, 20 >>>>>>>>>>>>>> <https://github.com/apache/spark/blob/master/common/variant/README.md#encoding-types> >>>>>>>>>>>>>> - Interval Types) and Iceberg already has a type which is not >>>>>>>>>>>>>> included within the Spark Specification (Time) and will soon >>>>>>>>>>>>>> have more with TimestampNS, and Geo. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Second, We would like to make sure that Spark is not a hard >>>>>>>>>>>>>> dependency for other engines. We are working with several >>>>>>>>>>>>>> implementers of the Iceberg spec and it has previously been >>>>>>>>>>>>>> agreed that it would be best if the source of truth for Variant >>>>>>>>>>>>>> existed in an engine and file format neutral location.
The >>>>>>>>>>>>>> Iceberg project has a good open model of governance and, as we >>>>>>>>>>>>>> have seen so far discussing Variant >>>>>>>>>>>>>> <https://lists.apache.org/thread/xcyytoypgplfr74klg1z2rgjo6k5b0sq>, >>>>>>>>>>>>>> open and active collaboration. This would also help as we can >>>>>>>>>>>>>> strictly version our changes in-line with the rest of the >>>>>>>>>>>>>> Iceberg spec. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Third, The Shredding spec is not quite finished and requires >>>>>>>>>>>>>> some group analysis and discussion before we commit it. I think >>>>>>>>>>>>>> again the Iceberg community is probably the right place for this >>>>>>>>>>>>>> to happen as we have already started discussions here on these >>>>>>>>>>>>>> topics. >>>>>>>>>>>>>> >>>>>>>>>>>>>> For these reasons I think we should go with a direct copy of the >>>>>>>>>>>>>> existing specification from the Spark Project and move ahead >>>>>>>>>>>>>> with our discussions and modifications within Iceberg. That >>>>>>>>>>>>>> said, *I do not want to diverge if possible from the Spark >>>>>>>>>>>>>> proposal*. For example, although we do not use the Interval >>>>>>>>>>>>>> types above, I think we should *not* reuse those type ids within >>>>>>>>>>>>>> our spec. Iceberg's Variant Spec types 19 and 20 would remain >>>>>>>>>>>>>> unused along with any other types we think are not applicable. >>>>>>>>>>>>>> We should strive whenever possible to allow for compatibility. >>>>>>>>>>>>>> >>>>>>>>>>>>>> In the interest of moving forward with this proposal I am hoping >>>>>>>>>>>>>> to see if anyone in the community objects to this plan going >>>>>>>>>>>>>> forward or has a better alternative. >>>>>>>>>>>>>> >>>>>>>>>>>>>> As always I am thankful for your time and am eager to hear back >>>>>>>>>>>>>> from everyone, >>>>>>>>>>>>>> Russ >>>>>>>>>>>>>> Xuanwo https://xuanwo.io/

Re: [DISCUSS] Cleanup svn dev/iceberg

2024-08-14 Thread Xuanwo
Got it. I will clean them up. On Wed, Aug 14, 2024, at 23:54, Fokko Driesprong wrote: > Hey Xuanwo, > > Feel free to clean those up as they should have been cleaned up a long time > ago. I'm also happy to do it myself, let me know! > > Kind regards, > Fokko >

[VOTE] Release Apache Iceberg Rust 0.3.0 RC1

2024-08-14 Thread Xuanwo
candidate: ./scripts/verify.py Thanks Xuanwo https://xuanwo.io/

[DISCUSS] Cleanup svn dev/iceberg

2024-08-14 Thread Xuanwo
/pyiceberg-0.2.0.tar.gz.sha512 Aiceberg-dev/pyiceberg-0.2.0rc0/pyiceberg-0.2.0.tar.gz How about cleaning up artifacts that have been voted on or cancelled? Xuanwo https://xuanwo.io/

[DISCUSS] Start iceberg-rust 0.3.0 release process

2024-08-13 Thread Xuanwo
/iceberg-rust/issues/543 What do you think? Xuanwo https://xuanwo.io/

Re: Welcome Péter, Amogh and Eduard to the Apache Iceberg PMC

2024-08-13 Thread Xuanwo
17 Russell Spitzer wrote: >>>>>>>> > Hi Y'all, >>>>>>>> > >>>>>>>> > It is my pleasure to let everyone know that the Iceberg PMC has >>>>>>>> > voted to >>>>>>>> > have several talented individuals join us. >>>>>>>> > >>>>>>>> > So without further ado, please welcome Péter Váry, Amogh Jahagirdar >>>>>>>> > and >>>>>>>> > Eduard Tudenhoefner to the Apache Iceberg PMC. >>>>>>>> > >>>>>>>> > As usual I am excited about the future of this community and >>>>>>>> > thankful for >>>>>>>> > the hard work and stewardship of its members. >>>>>>>> > >>>>>>>> > Thank you for your time, >>>>>>>> > Russell Spitzer >>>>>>>> > Xuanwo https://xuanwo.io/

Re: [DISCUSS] Filesystem in PyIceberg

2024-08-12 Thread Xuanwo
> >>> Any other suggestions? >>> >>> [1] >>> https://github.com/apache/iceberg/blob/ae08334cad1f1a9eebb9cdcf48ce5084da9bc44d/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java#L356 >>> [2] >>> https://github.com/apache/iceberg-python/blob/4f33f3a03841c9aa4f6ac389fea5726821f6f116/pyiceberg/io/fsspec.py#L350-L354 >>> [3]https://github.com/apache/iceberg-python/blob/4f33f3a03841c9aa4f6ac389fea5726821f6f116/pyiceberg/io/pyarrow.py#L346-L401 >>> [4] >>> https://github.com/apache/iceberg-python/blob/4f33f3a03841c9aa4f6ac389fea5726821f6f116/pyiceberg/io/pyarrow.py#L1335-L1349 >>> [5] >>> https://github.com/apache/iceberg-python/blob/4f33f3a03841c9aa4f6ac389fea5726821f6f116/pyiceberg/io/pyarrow.py#L1429-L1443 >>> >>> André Anastácio >>> Xuanwo https://xuanwo.io/

Re: [DISCUSS] How about setup a iceberg meetup in Beijing?

2024-08-12 Thread Xuanwo
Thank you, Kevin. The guide is really helpful; I will review and refine my proposal later. On Mon, Aug 12, 2024, at 02:17, Kevin Liu wrote: > Hi Xuanwo, > > Love the idea! We've been hosting the Seattle area meetup for the last couple > of months and are also helping to coordi

[DISCUSS] How about setup a iceberg meetup in Beijing?

2024-08-10 Thread Xuanwo
Hello, everyone. I'm starting this thread to discuss the possibility of organizing an Iceberg meetup in Beijing. The proposal is available at https://hackmd.io/@xuanwo/apache-iceberg-beijing-meetup-2024-10 Do you love this idea? Xuanwo https://xuanwo.io/

Re: [DISCUSS] Release Avro Java 1.11.4

2024-08-08 Thread Xuanwo
>> [1]: https://lists.apache.org/thread/7qz947mwlmh7md1dvd6q8587pbyglly7 > > > -- > Ryan Blue > Databricks Xuanwo https://xuanwo.io/

Re: [DISCUSS] Iceberg-rust based Ruby bindings

2024-08-06 Thread Xuanwo
bed and then > resubmitted after talking to Xuanwo on Slack. > > Is there a way to close this accidental duplicate thread? > > On 6 Aug 2024 at 5:06:40 PM, Renjie Liu wrote: >> Hi, Chris: >> >> Seems duplicated with this thread >> <https://lists.apache.or

Re: [VOTE] Vote for a logo of iceberg-rust

2024-08-05 Thread Xuanwo
thub.com/apache/iceberg-rust/discussions/523> on github actions, > so that more people could get involved. > > > > Prior Discussions > > 1. https://apache-iceberg.slack.com/archives/C05HTENMJG4/p1708425331928999 > 2. https://github.com/apache/iceberg-rust/discussions/516 Xuanwo https://xuanwo.io/

Re: [DISCUSS] Iceberg-rust based Ruby bindings

2024-08-05 Thread Xuanwo
33c0nkc3k6646lvro1lv22pvhwlp50ss >> https://github.com/apache/iceberg-rust/pull/518 >> >> *Prior Art in Ruby* >> ** >> https://github.com/matsadler/magnus >> https://github.com/oxidize-rb/rb-sys >> https://github.com/ankane/ruby-polars >> https://github.com/apache/opendal/tree/main/bindings/ruby >> >> Thanks, >> Chris Atkins Xuanwo https://xuanwo.io/

Re: [DISCUSS] Use iceberg-rust as pyiceberg file io

2024-08-05 Thread Xuanwo
rg side. However, for opendalfs fsspec FileIO, we need to parse the properties and convert them into appropriate opendalfs options for it to function properly. On Mon, Aug 5, 2024, at 15:04, Honah J. wrote: > Thanks Xuanwo for driving this and everyone for discussing, > > I like the idea

Re: [DISCUSS] Use iceberg-rust as pyiceberg file io

2024-08-02 Thread Xuanwo
Let's rock! Welcome to take a review: https://github.com/apache/iceberg-rust/pull/518 On Sat, Aug 3, 2024, at 12:13, Xuanwo wrote: > I also support integrating iceberg-rust with pyiceberg rather than building > something new on OpenDAL. > > OpenDAL backed FileIO will be usab

Re: [DISCUSS] Use iceberg-rust as pyiceberg file io

2024-08-02 Thread Xuanwo
t;>> should the pyo3 codes live, in iceberg-rust or in pyiceberg? What kind of >>> interface should we provide to pyiceberg, FileIO or OpenDAL? >> >> Do you have any experience with this? I see many projects having Rust and >> Python code in a single repository. There

Re: [DISCUSS] Use iceberg-rust as pyiceberg file io

2024-08-02 Thread Xuanwo
Hi, renjie Thank you for your support. I'll delve into the details and first build a PoC PR to make it clear. On Fri, Aug 2, 2024, at 22:51, Renjie Liu wrote: > Hi: > > Thanks Xuanwo for raising this. > > As mentioned in another thread, I think using iceberg-rust in py

Re: [DISCUSS] Use iceberg-rust as pyiceberg file io

2024-08-02 Thread Xuanwo
> Xuanwo, would PyIceberg and iceberg-rust share the underlying OpenDAL > implementations via pyo3 / fsspec bindings > <https://github.com/apache/opendal/issues/4511>? Hi, Raschkowski, good question! It's possible. There is an ongoing project developing fsspec bindings f

Re: [DISCUSS] Formalized File IO Properties

2024-07-31 Thread Xuanwo
>>> >>>>> Hello Everyone, >>>>> >>>>> I was considering discussing the standardization of Iceberg properties, >>>>> and I believe this thread could be a great place to start. >>>>> >>>>> I'm writing an

Re: [ANNOUNCE] Welcoming new committers and PMC members

2024-07-23 Thread Xuanwo
> > The Iceberg PMC is excited to announce new committers and PMC members to the > Apache Iceberg project. > > > New committers: > > > • Kevin Liu (kevinjqliu) > > • Piotr Findeisen (findepi) > > • Sung Yun (syun64) > > • Xuanwo (xuanwo) >

Re: [DISCUSS] Deprecate HadoopTableOperations, move to tests in 2.0

2024-07-17 Thread Xuanwo
ider similar fixes to the table spec. It >> currently describes how HadoopTableOperations works, which does not work in >> object stores or local file systems. HDFS is becoming much less common and I >> propose that we note that the strategy in the spec should ONLY be used with >> HDFS. >> >> What do other people think? >> >> Ryan >> >> -- >> Ryan Blue Xuanwo https://xuanwo.io/

Re: [VOTE] spec: remove the JSON spec for content file and file scan task sections

2024-07-10 Thread Xuanwo
next 72 hours. >>> >>> Thanks, >>> Steven >>> >>> [1] https://github.com/apache/iceberg/pull/9771 >>> [2] https://lists.apache.org/thread/2ty27yx4q0zlqd5h71cyyhb5k47yf9bv >>> >> >> >> -- >> Ryan Blue >> Databricks Xuanwo https://xuanwo.io/

[DISCUSS] Formalized File IO Properties

2024-07-10 Thread Xuanwo
https://github.com/apache/iceberg/blob/2b21020aedb63c26295005d150c05f0a5a5f0eb2/aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java#L46 Xuanwo https://xuanwo.io/

Re: [DISCUSS] Enable the discussion tab for iceberg github repos

2024-07-08 Thread Xuanwo
issues, e.g. >> feature tracking, bug tracking, etc. >> >> So I propose to enable the discussion tab for repos of iceberg and >> subprojects such as iceberg-rust, pyiceberg, iceberg-go. -- Xuanwo https://xuanwo.io/

Re: [DISCUSS] Enable the discussion tab for iceberg github repos

2024-07-08 Thread Xuanwo
ssues, e.g. feature > tracking, bug tracking, etc. > > So I propose to enable the discussion tab for repos of iceberg and > subprojects such as iceberg-rust, pyiceberg, iceberg-go. Xuanwo https://xuanwo.io/

Re: Iceberg Catalog Syncs Invite

2024-06-25 Thread Xuanwo
feel free to edit existing topics or > add more topics. Everyone should have full edit access. > > Looking forward to seeing everyone in the sync meetings! > > Best, > Jack Ye > > > > > > > > > > Xuanwo

Re: Feedback Collection: Bylaws in Iceberg

2024-06-25 Thread Xuanwo
y project is > that any problems you may have, or may be perceived to have, with > neutrality, can and should be solved by inviting more voices to the > conversation, and welcoming committers (especially non-code > committers!), and PMC members, a little earlier than you're entirely > comfortable with. It's not a panacea, but is about as close as we can > get in open source. > > Thanks for doing this hard work. -- Xuanwo

Re: [DISCUSSION] Preparing the Apache iceberg-rust 0.3.0 release

2024-06-19 Thread Xuanwo
). >> >> >> We need to wait for a while for the avro community's release to fix it, but >> it's not expected to be too long. >> >> We can move on with other things first, such as updating documentation. >> >> On Thu, Jun 20, 2024 at 4:39 AM

Re: Agenda Community Sync 19th June

2024-06-18 Thread Xuanwo
ity sync >> tomorrow. There currently is no entry in the google doc. >> >> Best wishes, >> >> Jan Xuanwo

Re: [DISCUSSION] Preparing the Apache iceberg-rust 0.3.0 release

2024-06-15 Thread Xuanwo
batches. > 6. Glue, hive catalog supported, without supporting updates. > 6. Several improvements to the rest catalog. > > I'm happy to volunteer to be the release manager for iceberg-rust 0.3.0. > > Welcome to join the discussion and share your thoughts! Xuanwo

Re: New committer: Renjie Liu

2024-03-09 Thread Xuanwo
better > productivity. A PMC member helps manage and guide the direction of the > project. > > Please join me in congratulating Renjie. > > Cheers, > Fokko Xuanwo

Re: [VOTE] Release Apache Iceberg Rust 0.2.0 RC1

2024-02-18 Thread Xuanwo
n >>>> > >>>> > To learn more about Apache Iceberg, please see >>>> > https://rust.iceberg.apache.org/ >>>> > >>>> > Checklist for reference: >>>> > >>>> > [ ] Download links are valid. >>

Re: [VOTE] Release Apache Iceberg Rust 0.2.0 RC1

2024-02-15 Thread Xuanwo
blob/main/CONTRIBUTING.md > > Huge thanks to: Amogh Jahagirdar, Chengxu Bian, Christian Daudt, Farooq > Qaiser, JanKaul, Manu Zhang, Mark Grey, Renjie Liu, Tyler Schauer, Xiaoyang > Liu, Xuanwo, ZENOTME, barronw, hiirrxnn, y0psolo, yi wang, zhjwpku and of > course dependabot[bot] fo

Re: [DISCUSS] iceberg-rust 0.2.0 release

2024-02-06 Thread Xuanwo
regards, >>>>> Fokko Driesprong >>>>> >>>>> Op wo 31 jan 2024 om 17:28 schreef Jack Ye : >>>>>> Excited about the progress in Rust! +1 for releasing 0.2.0 >>>>>> >>>>>> -Jack >>>>>> >>

Re: [Discuss] Change iceberg-python and iceberg-go CI Settings to only require approval for first time contributors

2024-02-01 Thread Xuanwo
pen a JIRA ticket to request > these changes. > > Previous discussion: > https://lists.apache.org/thread/sp1853jgp1lbdybgzdvv2m5cqhny5skr > > Best regards, > Honah Xuanwo

[DISCUSS] Change iceberg-rust CI Settings to only require approval for new github users

2024-01-31 Thread Xuanwo
So I opened a ticket at [2]. The INFRA team is interested in hearing our community's thoughts on this list. Feel free to leave your comments here. [1] https://apache-iceberg.slack.com/archives/C05HTENMJG4/p1706686077901739 [2] https://issues.apache.org/jira/browse/INFRA-25444 Xuanwo

Re: [DISCUSS] iceberg-rust 0.2.0 release

2024-01-31 Thread Xuanwo
ion for this crate: https://rust.iceberg.apache.org/ > > What's next? > > Eventually we will reach feature parity with java/python api, so that we can > bring full feature support of the iceberg to rust ecosystems. For details of > feature status, please check the `README.md` in github repo > <https://github.com/apache/iceberg-rust/> . Xuanwo

Re: Pagination for List APIs in the REST spec

2023-12-19 Thread Xuanwo
>>>>>> ListNamespacesResponse might allow for more backward >>>>>>>>>>>>> compatibility. In that scenario, pagination would only take >>>>>>>>>>>>> place for clients who know how to paginate and the ordering would >>>>>>>>>>>>> not need to be deterministic. >>>>>>>>>>>>> >>>>>>>>>>>>> -Dan >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Dec 15, 2023, 10:33 AM Micah Kornfield >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> Just to clarify and add a small suggestion: >>>>>>>>>>>>>> >>>>>>>>>>>>>> The behavior with no additional parameters requires the >>>>>>>>>>>>>> operations to happen as they do today for backwards >>>>>>>>>>>>>> compatibility (i.e either all responses are returned or a >>>>>>>>>>>>>> failure occurs). >>>>>>>>>>>>>> >>>>>>>>>>>>>> For new parameters, I'd suggest an opaque start token (instead >>>>>>>>>>>>>> of specific numeric offset) that can be returned by the service >>>>>>>>>>>>>> and a limit (as proposed above). If a start token is provided >>>>>>>>>>>>>> without a limit a default limit can be chosen by the server. >>>>>>>>>>>>>> Servers might return less than limit (i.e. clients are required >>>>>>>>>>>>>> to check for a next token to determine if iteration is >>>>>>>>>>>>>> complete). This enables server side state if it is desired but >>>>>>>>>>>>>> also makes deterministic listing much more feasible >>>>>>>>>>>>>> (deterministic responses are essentially impossible in the face >>>>>>>>>>>>>> of changing data if only a start offset is provided). >>>>>>>>>>>>>> >>>>>>>>>>>>>> In an ideal world, specifying a limit would result in streaming >>>>>>>>>>>>>> responses being returned with the last part either containing a >>>>>>>>>>>>>> token if continuation is necessary. Given conversation on the >>>>>>>>>>>>>> other thread of streaming, I'd imagine this is quite hard to >>>>>>>>>>>>>> model in an Open API REST service. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Therefore it seems like using pagination with token and offset >>>>>>>>>>>>>> would be preferred. 
If skipping someplace in the middle of the >>>>>>>>>>>>>> namespaces is required then I would suggest modelling those as >>>>>>>>>>>>>> first class query parameters (e.g. "startAfterNamespace") >>>>>>>>>>>>>> >>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>> Micah >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Dec 15, 2023 at 10:08 AM Ryan Blue >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> +1 for this approach >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I think it's good to use query params because it can be >>>>>>>>>>>>>>> backward-compatible with the current behavior. If you get more >>>>>>>>>>>>>>> than the limit back, then the service probably doesn't support >>>>>>>>>>>>>>> pagination. And if a client doesn't support pagination they get >>>>>>>>>>>>>>> the same results that they would today. A streaming approach >>>>>>>>>>>>>>> with a continuation link like in the scan API discussion >>>>>>>>>>>>>>> wouldn't work because old clients don't know to make a second >>>>>>>>>>>>>>> request. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Thu, Dec 14, 2023 at 10:07 AM Jack Ye >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> Hi everyone, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> During the conversation of the Scan API for REST spec, we >>>>>>>>>>>>>>>> touched on the topic of pagination when REST response is large >>>>>>>>>>>>>>>> or takes time to be produced. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I just want to discuss this separately, since we also see the >>>>>>>>>>>>>>>> issue for ListNamespaces and ListTables/Views, when >>>>>>>>>>>>>>>> integrating with a large organization that has over 100k >>>>>>>>>>>>>>>> namespaces, and also a lot of tables in some namespaces. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Pagination requires either keeping state, or the response to >>>>>>>>>>>>>>>> be deterministic such that the client can request a range of >>>>>>>>>>>>>>>> the full response. 
If we want to avoid keeping state, I think >>>>>>>>>>>>>>>> we need to allow some query parameters like: >>>>>>>>>>>>>>>> - *start*: the start index of the item in the response >>>>>>>>>>>>>>>> - *limit*: the number of items to be returned in the response >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> So we can send a request like: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> *GET /namespaces?start=300&limit=100* >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> *GET /namespaces/ns/tables?start=300&limit=100* >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> And the REST spec should enforce that the response returned >>>>>>>>>>>>>>>> for the paginated GET should be deterministic. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Any thoughts on this? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>> Jack Ye >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> Ryan Blue >>>>>>>>>>>>>>> Tabular >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Ryan Blue >>>>>>>> Tabular >>>>>> >>>>>> >>>>>> -- >>>>>> Ryan Blue >>>>>> Tabular >>>> >>>> >>>> -- >>>> Ryan Blue >>>> Tabular Xuanwo
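The token-plus-limit scheme suggested in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not the REST spec's actual API: the parameter names `start_token`, `limit` and `next_token`, the default limit, and the in-memory `fake_list_namespaces` server are all hypothetical. The fake server encodes the token as "last name returned", which is one way to keep listing stable when namespaces are added between requests.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical page shape: (items, next_token); next_token is None on the last page.
Page = Tuple[List[str], Optional[str]]

DEFAULT_LIMIT = 2  # assumed server-side default when a token arrives without a limit


def fake_list_namespaces(start_token: Optional[str] = None,
                         limit: Optional[int] = None) -> Page:
    """In-memory stand-in for a paginated ListNamespaces endpoint (not a real API)."""
    namespaces = ["analytics", "billing", "logs", "ml", "sales"]  # kept sorted
    if start_token is None and limit is None:
        # Backward-compatible path: no pagination parameters, return everything.
        return namespaces, None
    size = max(1, limit if limit is not None else DEFAULT_LIMIT)
    # The token is opaque to clients; here it is simply the last name returned.
    remaining = [n for n in namespaces if start_token is None or n > start_token]
    page = remaining[:size]
    next_token = page[-1] if len(remaining) > size else None
    return page, next_token


def list_all(fetch: Callable[..., Page], limit: int) -> Iterator[str]:
    """Client loop: keep requesting pages until the server stops returning a token."""
    token: Optional[str] = None
    while True:
        items, token = fetch(start_token=token, limit=limit)
        yield from items
        if token is None:  # the absence of a next token signals completion
            return


if __name__ == "__main__":
    print(list(list_all(fake_list_namespaces, limit=2)))
```

The client never inspects the token and never assumes a page is full; it only checks whether a next token came back, which matches the point that servers may return fewer items than the limit.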

Re: [PROPOSAL] Apache Iceberg 1.4.3 release

2023-11-15 Thread Xuanwo
fixing CVE-2023-2976. >>> >>> As the Avro CVE is classified high (see >>> https://nvd.nist.gov/vuln/detail/CVE-2023-39410), I propose to bump to >>> Avro 1.11.3 on our 1.4.x branch and release Iceberg 1.4.3 including >>> this. >>> >>> Thoughts ? >>> >>> If there are no objections, I'm volunteer to drive this release. >>> >>> Thanks, >>> Regards >>> JB Xuanwo

Re: [PROPOSAL] Improve dev/check-license

2023-10-27 Thread Xuanwo
]: https://github.com/apache/iceberg-rust/blob/main/.licenserc.yaml [3]: https://github.com/apache/iceberg-rust/blob/94a1c5d7742bc3b2a9ac7c8da20711a5e2578b89/.github/workflows/ci.yml#L38C1-L39C51 On Fri, Oct 27, 2023, at 22:17, Jean-Baptiste Onofré wrote: > Thanks for the heads up Xuanwo. > >

Re: [PROPOSAL] Improve dev/check-license

2023-10-27 Thread Xuanwo
ike: >> >> https://github.com/eskatos/creadur-rat-gradle/blob/master/src/main/kotlin/org/nosphere/apache/rat/RatWork.kt#L135 >> >> https://github.com/eskatos/creadur-rat-gradle/blob/master/src/main/kotlin/org/nosphere/apache/rat/RatWork.kt#L189 >> >> We can include this plugin the check gradle phase, meaning that we can >> verify headers for each PR. >> >> My preference would be for 3, mainly because: >> 1. it integrates smoothly in our gradle ecosystem, adding a new plugin >> as we have gradle-baseline-java, gradle-errorprone-plugin, >> spotless-plugin-gradle, etc >> 2. As we can hook rat gradle plugin in the gradle check task, it means >> license check will be perform at build time, including check on PR by >> GitHub Actions >> >> If you are OK with 3, I will work on: >> 1. a PR to use it >> 2. a PR for website to update release check procedure >> >> Thoughts ? >> >> Regards >> JB -- Xuanwo

Re: Meeting Minutes from 2023-10-11 Iceberg Sync

2023-10-26 Thread Xuanwo
> Chapter 4 V3 spec > changes for data storage. The team discusses v3 spec changes, including > partition stats, which may not be included in v3 due to a lack of need for > backward compatibility. If partition stats are required for v3, it would need > to be decided and implemented separately from the main v3 discussion. > Everyone should be aware that multi-column transforms are a v3-only change > and are likely to break in v2. There are also some potential forward-breaking > changes for Hadoop v3, including location path requirements and Delete vector > proposal. 34:39 <https://www.youtube.com/watch?v=euWtAKo_bV4&t=2079s> Chapter > 5 Metadata requirements for Iceberg V3 Xuanwo

Re: [VOTE] Release Apache Iceberg 1.4.1 RC0

2023-10-19 Thread Xuanwo
> Xuanwo, if you want to learn more about voting, there is also an Apache page > on it > <https://www.apache.org/foundation/voting.html#expressing-votes-1-0-1-and-fractions> > (that includes some suggestions :). But also feel welcome to ask on the > devlist here. Thanks

Re: [VOTE] Release Apache Iceberg 1.4.1 RC0

2023-10-19 Thread Xuanwo
an fix it > > Regards > JB > > On Thu, Oct 19, 2023 at 10:15 AM Xuanwo wrote: >> __ >>> You can see it’s what I mentioned in my vote email. However, as it’s like >>> this for a while, I voted +1 and I have PRs ready to be submitted >>> (including ra

Re: [VOTE] Release Apache Iceberg 1.4.1 RC0

2023-10-19 Thread Xuanwo
> Hi > > You can see it’s what I mentioned in my vote email. However, as it’s like > this for a while, I voted +1 and I have PRs ready to be submitted (including > rat execution). > > So do you think it’s blocking ? > > Regards > JB > > Le mer. 18 oct. 2023 à 16

Re: [VOTE] Release Apache Iceberg 1.4.1 RC0

2023-10-18 Thread Xuanwo
>>> >>> Please vote in the next 72 hours. >>> >>> [ ] +1 Release this as Apache Iceberg 1.4.1 >>> [ ] +0 >>> [ ] -1 Do not release this because... >>> >>> Only PMC members have binding votes, but other community members are >>> encouraged to cast >>> non-binding votes. This vote will pass if there are 3 binding +1 votes and >>> more binding >>> +1 votes than -1 votes. >>> >>> Xuanwo
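The pass condition quoted in this vote email can be written down as a small check. This is a sketch only: the function name `vote_passes` is hypothetical, and it reads "3 binding +1 votes" as "at least 3", which is an assumption about the intended meaning.

```python
def vote_passes(binding_plus_one: int, binding_minus_one: int) -> bool:
    """Return True if a release vote passes under the rule quoted above.

    Assumption: "3 binding +1 votes" means at least three. Non-binding votes
    and +0 votes are not counted, so they are not parameters here.
    """
    return binding_plus_one >= 3 and binding_plus_one > binding_minus_one


if __name__ == "__main__":
    print(vote_passes(3, 0))  # three binding +1, no -1
    print(vote_passes(2, 0))  # too few binding +1 votes
```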

Re: [VOTE] Release Apache PyIceberg 0.5.0 RC3

2023-09-14 Thread Xuanwo
ll/7782> >> (improved performance of the JSON (de)serialization) >> • A lot of bugfixes! >> The commit ID is f798b06246e67131d413dfceece5ccaf269e01fe >> >> >> >> • This corresponds to the tag: pyiceberg-0.5.0rc3 >> (37fa779b0957644590a03754a733a5b3e3f589d0) >> • https://github.com/apache/iceberg/releases/tag/pyiceberg-0.5.0rc3 >> • >> https://github.com/apache/iceberg/tree/f798b06246e67131d413dfceece5ccaf269e01fe >> >> >> The release tarball, signature, and checksums are here: >> >> >> >> • https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.5.0rc3/ >> >> >> You can find the KEYS file here: >> >> >> >> • https://dist.apache.org/repos/dist/dev/iceberg/KEYS >> >> >> Convenience binary artifacts are staged on pypi: >> >> >> >> https://pypi.org/project/pyiceberg/0.5.0rc3/ >> >> >> >> And can be installed using: pip3 install pyiceberg==0.5.0rc3 >> >> >> >> Please download, verify, and test. >> >> >> >> Please vote in the next 72 hours. >> >> >> >> [ ] +1 Release this as PyIceberg 0.5.0 >> >> [ ] +0 >> >> [ ] -1 Do not release this because... >> >> >> >> Cheers, Fokko >> >> >> Xuanwo

Re: Location of rust repo

2023-08-08 Thread Xuanwo
> visibility, easier coordination with the java project and more feedback >>>> from the community. >>>> >>>> The developers currently working on the rust implementation slightly >>>> favor a separate repository but would be okay with using the existing >>>> repository. >>>> >>>> >>>> It would be great if you could share your opinions on the topic. Maybe >>>> this could also be a point for the community sync later today. >>>> >>>> Hope you're all doing well. Best wishes, >>>> >>>> Jan >>> >>> >>> -- >>> Ryan Blue >>> Tabular Xuanwo