Re: Flink Table Maintenance - Tag based locking

2024-08-05 Thread Manu Zhang
Hi Peter, We rely on Airflow to schedule and coordinate maintenance Spark jobs. I agree with Ryan that an Iceberg solution is not a good choice here. Thanks, Manu On Tue, Aug 6, 2024 at 1:07 AM Ryan Blue wrote: > > We can make sure that the Tasks can tolerate concurrent runs, but as > mentione

Re: [DISCUSS] Describing REST Server capabilities

2024-08-05 Thread Walaa Eldin Moustafa
Catching up here. From Eduard's doc [1], it seems that at the end of the day, the capability boils down to whether an endpoint is implemented by the server or not. Therefore, I feel we could simplify things by skipping the categorization/grouping (e.g., tables, views, udfs, etc) and just allow s

Re: [DISCUSS] Extend Snapshot Metadata Lifecycle

2024-08-05 Thread Yufei Gu
Thanks Szehon for the new proposal. I think it is a useful feature with the least spec change. A candidate for v3 spec? Yufei On Tue, Jul 16, 2024 at 3:02 PM Szehon Ho wrote: > Hi, > > Thanks for reading through the proposal and the good feedback. I was > thinking about the mentioned concerns

Re: [RESULT][VOTE] Merge specification clarifications on reading/writing partition values

2024-08-05 Thread Amogh Jahagirdar
Thanks Micah and all who voted! I merged the change. Thanks, Amogh Jahagirdar On Mon, Aug 5, 2024 at 6:49 PM Micah Kornfield wrote: > The vote passes with: > > 3 +1 binding votes (Yufei, Daniel, Ryan) > 2 +1 non-binding votes (Micah, Prashant). > > Action items: merge the change. Could a comm

[RESULT][VOTE] Merge specification clarifications on reading/writing partition values

2024-08-05 Thread Micah Kornfield
The vote passes with: 3 +1 binding votes (Yufei, Daniel, Ryan) 2 +1 non-binding votes (Micah, Prashant). Action items: merge the change. Could a committer/PMC member help with this? Thanks, Micah On Mon, Aug 5, 2024 at 8:14 AM Daniel Weeks wrote: > +1 (binding) > > On Fri, Aug 2, 2024 at 1:2

Re: [DISCUSS] adoption of format version 3

2024-08-05 Thread Micah Kornfield
> > I suggest keeping those things separate — Micah, would you mind starting a > separate thread so this one can focus on v3? Yes I'll start another thread on this post V3, to allow for focus on closing off V3 with the current process (and see if there is interest in trying something new for v4.

Re: [DISCUSS] Clarify in REST spec expected implementation behavior for unknown updates or requirements

2024-08-05 Thread Amogh Jahagirdar
I also went back and forth on 400 vs 422 but ultimately concluded that 400 is the correct one to use here. My understanding is that 422 is meant to address semantic issues in the request, as opposed to 400, which is typically used for invalidly formatted input. As Dan mentioned, in this case the serve

Re: [DISCUSS] adoption of format version 3

2024-08-05 Thread Ryan Blue
At least for discussion purposes, I think the REST spec (and any spec that involves code that will ultimately be consumed) is probably a harder conversation. I agree that it’s a very different conversation and probably out of scope for the table v3 spec. I’m undecided if minor releases are necess

Re: [DISCUSS] Implementing a table-level statistics file to store column statistics

2024-08-05 Thread Alexander Jo
Thanks for starting this thread Huaxin, The existing statistics, on a per data file basis, are definitely too granular for use in planning/analysis time query optimizations. It's worked so far, as tables have been relatively small, but from what I've seen in the Trino community it is starting to b

Re: [DISCUSS] Changing namespace separator in REST spec

2024-08-05 Thread Daniel Weeks
I would agree with adding either a server side (config override) or client side control (query param with `?delim=.`) as it will be compatible with the current v1 endpoint. In the future we could introduce a v2 endpoint(s), but I would want to wait for OpenAPI 4 because they address this by allowi

Re: Flink Table Maintenance - Tag based locking

2024-08-05 Thread Ryan Blue
> We can make sure that the Tasks can tolerate concurrent runs, but as mentioned in the doc, in most cases having concurrent runs are a waste of resources, because of the commit conflicts. Is the problem that users may configure multiple jobs that are all trying to run maintenance procedures? If s

[DISCUSS] Iceberg-rust based Ruby bindings

2024-08-05 Thread Chris Atkins
Hi there, I'm following up on a discussion from the #rust channel on the Iceberg community slack, so starting a thread here too. After seeing Xuanwo's and Song's recent proposals around leveraging iceberg-rust to power part

Re: [DISCUSS] Clarify in REST spec expected implementation behavior for unknown updates or requirements

2024-08-05 Thread Daniel Weeks
I feel like this is a little bit of a gray area in terms of 400 vs 422. While I agree that 422 reads like the right answer just based on the definition of the codes, I think that it will be hard to implement and may not make sense in context of how the server evolves. If a server has not implement

Re: [VOTE] Vote for a logo of iceberg-rust

2024-08-05 Thread Xuanwo
Thanks a lot for driving the vote. I'm happy to see we have an iceberg-rust logo! On Mon, Aug 5, 2024, at 20:23, Renjie Liu wrote: > Hi: > > Following prior discussions[1, 2], I want to start a vote for the logo of > iceberg-rust. I've started a poll >

Re: [VOTE] Merge specification clarifications on reading/writing partition values

2024-08-05 Thread Daniel Weeks
+1 (binding) On Fri, Aug 2, 2024 at 1:25 PM Ryan Blue wrote: > +1 (binding) > > On Fri, Aug 2, 2024 at 12:03 PM Yufei Gu wrote: > >> +1 (binding) >> Yufei >> >> >> On Fri, Aug 2, 2024 at 11:18 AM Prashant Singh >> wrote: >> >>> +1 (non-binding) >>> Thanks Micah ! >>> >>> Regards, >>> Prashant

[VOTE] Vote for a logo of iceberg-rust

2024-08-05 Thread Renjie Liu
Hi: Following prior discussions[1, 2], I want to start a vote for the logo of iceberg-rust. I've started a poll on github actions, so that more people could get involved. Prior Discussions 1. https://apache-iceberg.slack.com/archives/C05

Re: [DISCUSS] Iceberg-rust based Ruby bindings

2024-08-05 Thread Xuanwo
Hi, Chris I love this idea. One of the main reasons I started working on iceberg-rust is due to the potential that a rust-powered iceberg core can offer. I'm not an experienced ruby developer, but I'm willing to help with some CI setup or docs since I have some experience in the opendal commun

Re: [DISCUSS] Iceberg-rust based Ruby bindings

2024-08-05 Thread Renjie Liu
Hi, Chris: Thanks for raising this. Generally I'm +1 on building ruby bindings on top of the rust implementation, which would help introduce iceberg into the ruby ecosystem. On Mon, Aug 5, 2024 at 7:30 PM Chris Atkins wrote: > Hi there, > > I'm following up on a discussion >

Re: [DISCUSS] Use iceberg-rust as pyiceberg file io

2024-08-05 Thread Xuanwo
> For the FileIO part, just curious—since Rust's FileIO currently also uses > OpenDAL, will there be any functional differences in terms of supported > storage services or configurations (like profile_name, signer, etc.) compared > to using opendalfs directly in Python in the future? Will Rust's

Re: [DISCUSS] Use iceberg-rust as pyiceberg file io

2024-08-05 Thread Honah J.
Thanks Xuanwo for driving this and everyone for discussing, I like the idea of pushing down low-level logic to Iceberg-rust (pyiceberg_core). It’s great to have another option besides PyArrow for reading and writing data in PyIceberg. Thanks, Xuanwo, for moving this forward with the initial PR to