Re: Wide tables in V4

2025-05-27 Thread Daniel Weeks
I feel like we have two different issues we're talking about here that aren't necessarily tied (though solutions may address both): 1) wide tables, 2) adding columns Wide tables are definitely a problem where parquet has limitations. I'm optimistic about the ongoing work to help improve parquet fo

Re: [VOTE] [REST SPEC] Add row lineage fields.

2025-05-27 Thread Daniel Weeks
+1 (binding) On Fri, May 23, 2025 at 9:49 AM huaxin gao wrote: > +1 (non-binding) > > On Fri, May 23, 2025 at 9:20 AM Yufei Gu wrote: > >> +1 (binding) >> Yufei >> >> >> On Fri, May 23, 2025 at 9:18 AM Jean-Baptiste Onofré >> wrote: >> >>> +1 (non binding) >>> >>> Regards >>> JB >>> >>> Le ven

Re: [VOTE] Add commit timestamp to CommitReport

2025-05-20 Thread Daniel Weeks
Hey Manu, I didn't see a discuss thread on this topic, so I'll add my concerns here. The issue I have is around the fidelity of what we're using as a commit timestamp. I feel like we're adding additional information that isn't accurate and doesn't add a whole lot of value. The metrics already in

Re: [VOTE] Adopt the v3 spec changes

2025-05-20 Thread Daniel Weeks
+1 (binding) Very excited to reach this milestone! -Dan On Tue, May 20, 2025 at 5:26 AM Manu Zhang wrote: > +1 (non-binding). Thanks Ryan for driving this and everyone contributing > to the new features. > > Regards, > Manu > > On Tue, May 20, 2025 at 20:14 Péter Váry wrote: > >> +1 (binding) >> Well done e

Re: [VOTE] Release Apache Iceberg 1.9.1 RC0

2025-05-18 Thread Daniel Weeks
+1 (binding) Verified sigs/sums/license/build/test Checked that the iceberg build version is correctly represented. Ran into the hadoop commit test timeouts, but succeeded on re-attempt (I believe we have fixes upstream for this). -Dan On Sun, May 18, 2025 at 5:20 PM Steven Wu wrote: > +1 (b

Re: [VOTE] Clarify writer requirements in the spec to prevent orphan DVs

2025-05-14 Thread Daniel Weeks
+1 (binding) On Wed, May 14, 2025 at 9:02 AM Russell Spitzer wrote: > +1 (Binding) > > On Wed, May 14, 2025 at 10:52 AM Anton Okolnychyi > wrote: > >> Hi all, >> >> I propose the following update to the spec to clarify that writers must >> remove any deletion vector that applies to a data file

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-05-08 Thread Daniel Weeks
>>>>>>> Walaa. >>>>>>>>> >>>>>>>>> On Tue, Apr 29, 2025 at 10:07 PM Rishabh Bhatia < >>>>>>>>> bhatiarishab...@gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Hello Wa

Re: [VOTE] Minor clarification for Geo Spec

2025-05-07 Thread Daniel Weeks
+1 (binding) On Wed, May 7, 2025 at 7:24 AM Russell Spitzer wrote: > +1 (bind) > > On Wed, May 7, 2025 at 7:32 AM Eduard Tudenhöfner < > etudenhoef...@apache.org> wrote: > >> +1 (binding) >> >> On Wed, May 7, 2025 at 4:14 AM Gang Wu wrote: >> >>> The clarification is simple and clear from the w

Re: [VOTE] Add encryption keys to table metadata

2025-04-30 Thread Daniel Weeks
+1 (binding) On Wed, Apr 30, 2025 at 1:38 PM Amogh Jahagirdar <2am...@gmail.com> wrote: > +1 (binding) > > On Wed, Apr 30, 2025 at 1:29 PM Anurag Mantripragada > wrote: > >> +1 (non-binding) >> >> ~ Anurag Mantripragada >> >> >> >> >> >> >> >> On Apr 30, 2025, at 11:44 AM, Ryan Blue wrote: >> >

Re: [Discuss] Streamlining Release Notes Preparation

2025-04-28 Thread Daniel Weeks
I'm not a big proponent of adding additional process to PRs/contributions. It's very hard to enforce consistently and then we get the worst of both where there are missing contributions and someone still needs to wade through and diff at the time of release. I find the automated release notes

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-28 Thread Daniel Weeks
ve inconsistently between queries and views, or > inconsistency between Iceberg and other types of views? > > Thanks, > Walaa > > > On Mon, Apr 28, 2025 at 11:34 AM Daniel Weeks wrote: > >> I would agree with Jan's summary of why 'default-catalog' was i

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-28 Thread Daniel Weeks
I would agree with Jan's summary of why 'default-catalog' was introduced, but I think we need to step back and align on what we are really attempting to support in the spec. The issues we're discussing largely stem from using multiple engines with cross catalog references and configurations where

Re: [VOTE] Small spec change for default values

2025-04-23 Thread Daniel Weeks
+1 (binding) On Wed, Apr 23, 2025, 3:12 PM Anton Okolnychyi wrote: > +1 (binding) > > The proposed V3 behavior would already be a lot more flexible than what > most engines support in the industry today. It is also not covered by the > SQL standard, so there is no need to overcomplicate the spec

Re: [VOTE] Spec Update: Variant Field Lower/Upper Bounds

2025-04-19 Thread Daniel Weeks
+1 (binding) On Sat, Apr 19, 2025, 12:07 AM Gang Wu wrote: > +1 (non-binding) > > On Sat, Apr 19, 2025 at 12:12 PM Steve Zhang > wrote: > >> +1 (non-binding) >> >> Thanks, >> Steve Zhang >> >> >> >> On Apr 18, 2025, at 1:29 PM, huaxin gao wrote: >> >> +1 (non-binding) >> >> >>

Re: [VOTE] Simplify multi-argument field-id(s) encoding

2025-04-17 Thread Daniel Weeks
+1 (binding) On Thu, Apr 17, 2025 at 8:41 AM Russell Spitzer wrote: > +1 (Bind) > > On Thu, Apr 17, 2025 at 8:14 AM Jean-Baptiste Onofré > wrote: > >> +1 (non binding) (as said in the PR :)) >> >> Thanks ! >> >> Regards >> JB >> >> On Thu, Apr 17, 2025 at 3:00 PM Fokko Driesprong >> wrote: >>

Re: [VOTE] Update row lineage spec ID assignment

2025-04-17 Thread Daniel Weeks
+1 (binding) I think this update really helps ensure row ids will be present and reliable for upgraded tables. Thanks Ryan! On Wed, Apr 16, 2025 at 4:09 PM Ryan Blue wrote: > Hi everyone, > > I’d like to start a vote to incorporate the spec changes in PR 12781 >

Re: Hive 4 support

2025-04-11 Thread Daniel Weeks
Hey Wing, There are a few concerning issues I have with the current PR: 1) We need to update LICENSE/NOTICE for the hive3/hive4 dependencies because I believe we only looked at the referenced version 2) We're producing artifacts for hive3 and hive4 modules which I think we want to exclude (we sho

Re: Optimize Equality Deletes with Sorting

2025-04-07 Thread Daniel Weeks
Hey Edgar, Thanks for the well articulated proposal. I'm a little concerned that the proposed approach only partially addresses the underlying challenge with equality deletes. Equality deletes are extremely powerful because you can delete a row anywhere in the dataset without any read cost. The

Re: [VOTE] Row lineage required for v3

2025-04-05 Thread Daniel Weeks
Thanks to everyone for taking the time to discuss and vote! I'm going to call the vote as passing with: 9 +1 binding votes 6 +1 non-binding votes (no 0/-1 votes cast) -Dan On Tue, Apr 1, 2025 at 12:22 PM Fokko Driesprong wrote: > +1 > > On Tue, Apr 1, 2025 at 21:15 Péter Váry wrote: > >> +1 >>

Re: [VOTE] Minor simplifications for Geo Spec

2025-04-04 Thread Daniel Weeks
+1 On Wed, Mar 19, 2025 at 1:50 PM Jean-Baptiste Onofré wrote: > +1 (non binding) > > Regards > JB > > On Wed, Mar 19, 2025 at 1:01 AM Szehon Ho wrote: > > > > Hi everyone, > > > > While working on the reference implementation for Geometry/Geography > spec, we noticed some parts that can be sim

[VOTE] Row lineage required for v3

2025-03-31 Thread Daniel Weeks
Hey Everyone, I'd like to raise the proposal to make row-lineage required by default to a vote. There was general support for this change in the discussion thread with the reasoning t

[DISCUSS] Row lineage required for v3

2025-03-19 Thread Daniel Weeks
Hey everyone, When Row lineage was originally introduced, it was believed to be incompatible with equality deletes and we initially added lineage as a feature that could be turned on. Now that these features can co-exist, we would

Re: [VOTE] Improve OpenAPI documentation around how NamespaceNotEmptyException is treated

2025-03-18 Thread Daniel Weeks
+1 as well for 409 On Tue, Mar 18, 2025 at 1:43 PM Ryan Blue wrote: > +1 for the updated 409 code. > > On Tue, Mar 18, 2025 at 1:41 PM Yufei Gu wrote: > >> +1. Thanks Eduard! >> >> Yufei >> >> >> On Tue, Mar 18, 2025 at 3:46 AM Eduard Tudenhöfner < >> etudenhoef...@apache.org> wrote: >> >>> I h

Re: [DISCUSS] Rename iceberg repo to iceberg-java ?

2025-03-14 Thread Daniel Weeks
I'm not in favor of the rename at this point. This just seems unnecessary and since the java implementation is the reference implementation I don't think there's real confusion caused by leaving the repo as it is. I know this has been done in other projects like `parquet-mr` -> `parquet-java`, bu

Re: [VOTE] Release Apache Iceberg 1.8.1 RC1

2025-02-27 Thread Daniel Weeks
+1 (binding) Verified sigs/sums/license/build/test (Java 17) -Dan On Thu, Feb 27, 2025 at 11:30 AM Aihua Xu wrote: > Thanks Alex. > > On Thu, Feb 27, 2025 at 10:31 AM Alex Dutra > wrote: > >> Hi Aihua, >> >> I was indeed suspecting that you had a custom RESTClient :-) Thanks for >> digging fu

Re: [VOTE] Java implementation notes around current-snapshot-id

2025-02-24 Thread Daniel Weeks
+1 On Mon, Feb 24, 2025, 11:00 AM Russell Spitzer wrote: > +1 > > On Mon, Feb 24, 2025 at 12:55 PM Fokko Driesprong > wrote: > >> Hi everyone, >> >> Recently, there was confusion >> about >> valid values for the current-snapshot

Re: [DISCUSS] Rest Catalog 419 Response Code

2025-02-20 Thread Daniel Weeks
Hey Sung, My interpretation is that it's up to the REST Server to decide whether to send a 419 or 401 response code (I don't think it's a mandate). The use case for 419 would be that the client has client credentials or can re-authenticate via some other mechanism and could reattempt the request.
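To illustrate the 419 use case described above, here is a minimal client-side sketch, not the behavior of any Iceberg client. It assumes a hypothetical `send` callable that performs the request with a token and returns the HTTP status code, and a hypothetical `reauthenticate` callable that is only present when the client holds credentials it can use to obtain a new token.

```python
from typing import Callable, Optional


def send_with_reauth(
    send: Callable[[str], int],
    token: str,
    reauthenticate: Optional[Callable[[], str]],
) -> int:
    """Send a request once, and retry once if the server answers 419.

    `send` and `reauthenticate` are hypothetical helpers for this sketch:
    `send(token)` returns the HTTP status code, `reauthenticate()` returns
    a fresh token.
    """
    status = send(token)
    # 419 signals that re-authenticating and reattempting may succeed.
    # A 401 is returned to the caller unchanged in this sketch.
    if status == 419 and reauthenticate is not None:
        status = send(reauthenticate())
    return status
```

A client without any way to re-authenticate passes `None` and simply sees the original status code, whichever of the two the server chose to send.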

Re: [ANNOUNCE] Apache Iceberg release 1.8.0

2025-02-18 Thread Daniel Weeks
is is not required for other clients? > > Unless we say that it is required we can have other implementations which > can return null or missing. I'm not a big fan of changing the spec to match > the implementation especially when we are narrowing what is allowed. > > > On

Re: [ANNOUNCE] Apache Iceberg release 1.8.0

2025-02-18 Thread Daniel Weeks
I would agree the best path forward is to note the current behavior for v1/v2 since that's well established and address the behavior in v3. For compatibility with existing libraries, we should maintain that `-1` is equivalent to no snapshot and it should be written for v1/v2. With V3 we should su
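The v1/v2 convention described above can be sketched roughly as follows. This is an illustration under assumptions, not the Java implementation: the helper names are hypothetical, the metadata is modeled as a plain dict keyed by `current-snapshot-id`, and treating a null or missing value like `-1` on read is an assumption taken from the discussion. The V3 behavior is deliberately left out.

```python
from typing import Optional

# Sentinel that existing libraries write for "no current snapshot" in v1/v2.
NO_SNAPSHOT = -1


def read_current_snapshot_id(metadata: dict) -> Optional[int]:
    """Return the current snapshot id, or None when the table has none.

    Hypothetical helper: -1, null, and a missing field are all read as
    "no snapshot" (the last two are an assumption of this sketch).
    """
    value = metadata.get("current-snapshot-id")
    if value is None or value == NO_SNAPSHOT:
        return None
    return value


def write_current_snapshot_id_v1_v2(snapshot_id: Optional[int]) -> int:
    """Return the value to write for v1/v2 metadata (hypothetical helper)."""
    return NO_SNAPSHOT if snapshot_id is None else snapshot_id
```

Reading leniently and writing `-1` keeps v1/v2 output compatible with libraries that already expect the sentinel.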

Re: [ANNOUNCE] Apache Iceberg release 1.8.0

2025-02-13 Thread Daniel Weeks
I think we wanted to perform a check against the endpoint support like other view operations since they aren't expected to be implemented. PR here is what I believe we want. -Dan On Thu, Feb 13, 2025 at 7:37 AM Eduard Tudenhöfner wrote: > This is

Re: [VOTE] Add overwriteRequested to RegisterTableRequest in REST spec

2025-02-13 Thread Daniel Weeks
+1 On Thu, Feb 13, 2025 at 9:07 AM Fokko Driesprong wrote: > +1 > > On Thu, Feb 13, 2025 at 18:06 Steven Wu wrote: > >> +1 here. >> >> already approved the PR yesterday >> >> On Thu, Feb 13, 2025 at 8:17 AM Russell Spitzer < >> russell.spit...@gmail.com> wrote: >> >>> +1 >>> >>> On Wed, Feb 12,

Re: [VOTE] Release Apache Iceberg 1.8.0 RC0

2025-02-12 Thread Daniel Weeks
+1 (binding) Verified sigs/sums/license/build/test (Java 17) I also manually tested a number of cases with format v3 and DVs. -Dan On Wed, Feb 12, 2025 at 8:57 AM Ajantha Bhat wrote: > +1 (non-binding) > > * validated checksum and signature > * checked license docs & ran RAT checks > * ran bu

Re: Table metadata swap not work for REST Catalog (#12134)

2025-02-10 Thread Daniel Weeks
Hey Steve, I think the issue here is that you're using the commit api in table operations to perform a non-incremental/linear change to the metadata. The REST implementation is a little more strict in that it builds a set of updates based on the mutations made to the metadata and the commit proce

Re: Proposal: Parquet footer size in Iceberg metadata

2025-02-09 Thread Daniel Weeks
" will be >> enough, which is what a draft PR i have for hadoop fileIO does. >> >> On Wed, 22 Jan 2025 at 03:39, Sreeram Garlapati >> wrote: >> >>> Thanks for the nice idea/suggestion, Dan. >>> Yes, we have been employing a similar technique that y

Re: [VOTE] Add Geometry and Geography types for V3

2025-02-06 Thread Daniel Weeks
+1 On Thu, Feb 6, 2025, 4:02 PM Russell Spitzer wrote: > +1 > > On Fri, Feb 7, 2025 at 12:57 AM Denny Lee wrote: > >> +1 (non-binding) - super exciting! >> >> On Thu, Feb 6, 2025 at 3:52 PM rdb...@gmail.com wrote: >> >>> +1 >>> >>> Awesome to see this ready to go! >>> >>> On Thu, Feb 6, 2025 a

Re: Welcome Huaxin Gao as a committer!

2025-02-06 Thread Daniel Weeks
Congrats Huaxin! On Thu, Feb 6, 2025 at 11:52 AM Anurag Mantripragada wrote: > Congratulations, Huaxin! Great work! > > ~ Anurag Mantripragada > > > > On Feb 6, 2025, at 10:36 AM, Honah J. wrote: > > Congratulations Huaxin! > > On Thu, Feb 6, 2025 at 10:32 AM John Zhuge wrote: > >> Congratulat

Re: missing files in an Iceberg table

2025-01-28 Thread Daniel Weeks
Hey Wing Yew, I would agree that this is a common problem and we need a way to get tables back into a good state when something unexpected happens. Amogh and Matt have a PR (API: Define RepairManifests action interface #10784) that was originally

Re: [VOTE] Add initial/write defaults to REST spec

2025-01-28 Thread Daniel Weeks
>>>>>> +1 >>>>>> Yufei >>>>>> >>>>>> >>>>>> On Fri, Jan 24, 2025 at 2:15 PM Amogh Jahagirdar <2am...@gmail.com> >>>>>> wrote: >>>>>> >>>>>>> +1 (b

Re: [discuss] Standardizing Naming Schemes for Language-Specific Configurations

2025-01-27 Thread Daniel Weeks
We've run into this same issue in a number of cases previously and I think where we want to ideally go (in most cases) is language agnostic properties/values. For example, the property for `catalog-impl`

Re: [VOTE] REST API changes for freshness-aware table loading

2025-01-24 Thread Daniel Weeks
+1 On Wed, Jan 22, 2025 at 1:19 PM Yufei Gu wrote: > +1. Thanks, Gabor! A bit more context, we synced on this spec change > during this morning's community catalog meeting and reached a general > consensus on the approach. > > Yufei > > > On Wed, Jan 22, 2025 at 12:05 PM Gabor Kaszab > wrote: >

[VOTE] Add initial/write defaults to REST spec

2025-01-24 Thread Daniel Weeks
Everyone, I'd like to hold a quick vote for a small addition to the REST spec to include the initial/write defaults introduced in v3 as optional fields to the table schema. PR: OpenAPI: add initial/write defaults to schema #12094 Thanks, Dan

Re: [DISCUSS, VOTE] OpenAPI Metadata Update for EnableRowLineage

2025-01-23 Thread Daniel Weeks
+1 Sorry to have missed the discussion, but I'm onboard with the proposed changes. On Thu, Jan 23, 2025 at 11:27 AM Yufei Gu wrote: > +1 > Yufei > > > On Thu, Jan 23, 2025 at 11:05 AM huaxin gao > wrote: > >> +1 (non binding) >> >> Thanks Russell. >> >> On Thu, Jan 23, 2025 at 10:55 AM Fokko D

Re: [DISCUSS, VOTE] OpenAPI Metadata Update for EnableRowLineage

2025-01-22 Thread Daniel Weeks
Just a minor question added to the PR. We're adding an explicit 'enable' as an update type and I wonder if it would be better to generalize it so that we don't have separate updates to disable/enable (more forward thinking as this is the first case quite like this). -Dan On Wed, Jan 22, 2025 a

Re: Proposal: Parquet footer size in Iceberg metadata

2025-01-21 Thread Daniel Weeks
Hey Sreeram, I think it's worthwhile to consider what value would be added by tracking the footer size in metadata, but there are other options to address these optimization use cases. For example, if you take a look at the RangeReadable

Re: [VOTE] Document Snapshot Summary Optional Fields as Subsection of Appendix F in Spec

2025-01-21 Thread Daniel Weeks
+1 (binding) On Tue, Jan 21, 2025 at 1:05 PM Szehon Ho wrote: > +1 (binding) > > Thanks > Szehon > > On Tue, Jan 21, 2025 at 12:55 PM Yufei Gu wrote: > >> +1 Thanks Honah! >> >> Yufei >> >> >> On Tue, Jan 21, 2025 at 12:45 PM Russell Spitzer < >> russell.spit...@gmail.com> wrote: >> >>> +1 >>>

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-21 Thread Daniel Weeks
Hey Manu, I think I understand what you're trying to achieve here and I feel like the most important part is to have an updated version of the retention procedure to clearly state how this interacts with the other settings as part of the

Re: [DISCUSS] Add metadata stats/metrics management on the REST Spec

2025-01-21 Thread Daniel Weeks
Hey JB, I'm not sure I fully understand what the proposal is, but I also realise it's probably not completely fleshed out yet. When you say "manage metadata", the first concern that I have is whether you mean to just query/get the info or to also modify it. Table metadata is immutable and requir

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Daniel Weeks
+1 On Tue, Jan 21, 2025 at 8:38 AM Eduard Tudenhöfner wrote: > +1 > > On Tue, Jan 21, 2025 at 5:34 PM Marc Cenac > wrote: > >> +1 non-binding >> >> On Tue, Jan 21, 2025 at 8:19 AM Sung Yun wrote: >> >>> +1 non-binding >>> >>> Thanks for driving this Christian! >>> >>> On 2025/01/21 12:39:26 Ru

Re: [Discuss][Vote] Spec Change - Add optional field added-rows to Snapshot for Row Lineage

2025-01-16 Thread Daniel Weeks
+1 On Thu, Jan 16, 2025 at 10:39 AM Steve Zhang wrote: > Thank you Russell! +1 (non-binding) > > Thanks, > Steve Zhang > > > > On Jan 15, 2025, at 10:53 PM, huaxin gao wrote: > > +1 (non-binding) > > >

Re: [DISCUSS] Apache Iceberg (java) 1.8.0 release

2025-01-16 Thread Daniel Weeks
Robert, I hear your frustration with the progress on the Auth Manager work, but I believe everyone recognizes that this was a large refactor further complicated by the need to preserve backward compatibility and handling deprecations appropriately. This work has gone through many iterations as we

Re: [VOTE] Document Snapshot Summary Optional Fields as Appendix in Spec

2025-01-15 Thread Daniel Weeks
ames for consistency across implementations. -Dan On Wed, Jan 15, 2025 at 8:07 AM Russell Spitzer wrote: > @Daniel Weeks what do you think? > > I know both you and I had the opposite feeling here. > > On Tue, Jan 14, 2025 at 6:21 PM rdb...@gmail.com wrote: > >> The c

Re: [VOTE] Drop Hive runtime

2024-12-20 Thread Daniel Weeks
+1 On Wed, Dec 18, 2024 at 10:41 PM Jean-Baptiste Onofré wrote: > +1 (non binding) > > I did a pass on the PRs and they look good to me. > > Thanks Manu ! > Regards > JB > > On Wed, Dec 18, 2024 at 2:59 AM Manu Zhang > wrote: > > > > Hi all, > > > > Thanks for sharing your ideas in the discussi

Re: [Discuss] Document Snapshot Summary Optional Fields for Standardization

2024-12-12 Thread Daniel Weeks
I'm generally in support of this as well, but I think we should put this in an appendix as opposed to the main body of the spec. -Dan On Wed, Dec 11, 2024 at 2:18 PM Russell Spitzer wrote: > I want to float this back up, I think this is a really good idea for cross > engine support. I don't thi

Re: [DISCUSS] Hive Support

2024-12-12 Thread Daniel Weeks
Hey Manu, I agree with the direction here, but we should probably hold a quick procedural vote just to confirm since this is a significant change in support for Hive. -Dan On Wed, Dec 11, 2024 at 5:19 PM Manu Zhang wrote: > Thanks all for sharing your thoughts. It looks like there's a consensus on

Re: [DISCUSS] Relocate Parquet to Iceberg Core

2024-12-06 Thread Daniel Weeks
we have any strong use case or feature that requires it now? > b) I hope we do the same for ORC as well as it looks odd to have a > module for that? > > - Ajantha > > On Sat, Dec 7, 2024 at 5:22 AM Daniel Weeks wrote: > >> Everyone, >> >> I wanted to propose

[DISCUSS] Relocate Parquet to Iceberg Core

2024-12-06 Thread Daniel Weeks
Everyone, I wanted to propose moving the parquet implementation from the 'iceberg-parquet' project to the 'iceberg-core' project. The original motivation for keeping these subprojects separate was due to Iceberg relying on avro (which is included in the core project) for metadata and keeping othe

Re: [VOTE] Deprecate and remove last-column-id

2024-11-20 Thread Daniel Weeks
+1 On Wed, Nov 20, 2024, 9:41 AM Aihua Xu wrote: > +1 non-binding. > > Thanks for driving this. > > On Tue, Nov 19, 2024 at 5:50 PM Renjie Liu > wrote: > >> +1, thanks Fokko! >> >> On Wed, Nov 20, 2024 at 8:45 AM Steve Zhang >> wrote: >> >>> +1 nb >>> >>> Thanks, >>> Steve Zhang >>> >>> >>> >>

Re: [VOTE] Release Apache PyIceberg 0.8.0rc2

2024-11-18 Thread Daniel Weeks
+1 (binding) Verified sigs/sums/license/tests+s3 (Python 3.11.9) -Dan On Sat, Nov 16, 2024 at 4:03 PM André Luis Anastácio wrote: > +1 (non-binding) > > - verified signature and checksum > - verified license check > - ran install and some manual tests in python 3.11 > > André Anastácio > > On

Re: [VOTE] Release Apache PyIceberg 0.8.0rc1

2024-11-12 Thread Daniel Weeks
+1 (binding) Verified sigs/sums/license/tests (Python 3.11.9) Noted a couple small things that I don't think are blockers: 1. When attempting to use data_files metadata tables, I needed to install both pyarrow and s3fs. I would have expected that I would just need pyarrow if that was the FileIO

Re: [VOTE] Release Apache Iceberg 1.7.0 RC1

2024-11-06 Thread Daniel Weeks
+1 (binding) Verified sigs/sums/license/build/test (Java 17) -Dan On Wed, Nov 6, 2024 at 3:23 PM Jack Ye wrote: > +1 (binding) > > - Verified signature, checksum, license > - Ran build and test with JDK 11 and 17 > - Ran AWS integration tests > - Ran on Spark 3.5 with some manual tests > > Bes

Re: [VOTE] Release Apache Iceberg 1.7.0 RC0

2024-11-01 Thread Daniel Weeks
+1 (binding) Verified sigs/sums/license/build/test Also did some manual verification using spark and everything checks out. -Dan On Fri, Nov 1, 2024 at 10:52 AM Steven Wu wrote: > +1 (binding) > > Verified signature, checksum, license. Did Flink SQL local testing with > the runtime jar. > > D

Re: [DISCUSS] Partial Metadata Loading

2024-10-31 Thread Daniel Weeks
seems true that "full metadata doesn't align with > *almost > all* use cases". > > Even if most use cases do need 90% of the metadata, it seems like a useful > optimization for the client to not have to request whatever it doesn't > need. This also gives u

Re: [DISCUSS] Partial Metadata Loading

2024-10-31 Thread Daniel Weeks
I'd like to clarify my concerns here because I think there are more aspects to this than we've captured. *Partial metadata loads add significant complexity to the protocol* Iceberg metadata is a complicated structure and finding a way to represent how and what we want to piece apart is non-trivia

Re: [VOTE] Deletion Vectors in V3

2024-10-30 Thread Daniel Weeks
+1 (binding) -Dan On Wed, Oct 30, 2024 at 10:51 AM Prashant Singh wrote: > +1 (non-binding) > > Thanks, > Prashant > > On Wed, Oct 30, 2024 at 10:16 AM Russell Spitzer < > russell.spit...@gmail.com> wrote: > >> +1 - My comments are clear on the other thread for future proposals >> >> On Wed, O

Re: [VOTE] Endpoint for refreshing vended credentials

2024-10-21 Thread Daniel Weeks
+1 On Mon, Oct 21, 2024 at 3:07 AM Eduard Tudenhöfner wrote: > Hey everyone, > > I'd like to vote on #11281 , > which introduces a new endpoint and allows > retrieving/refreshing vended credentials for a given table. > > Please vote +1 if you general

Re: [DISCUSS] Discrepancy Between Iceberg Spec and Java Implementation for Snapshot summary's 'operation' key

2024-10-17 Thread Daniel Weeks
clarification Daniel, and thank you Kevin for raising >> this issue! >> >> Does that mean that we are creating component schemas that are the >> superset of the V1 and V2 schemas? And if so, should we remove summary and >> manifest-list from the required properties, an

Re: [Discuss] Iceberg View Interoperability

2024-10-17 Thread Daniel Weeks
Hey Ajantha, I think it's good to figure out a path forward for extending view support, but I'm not convinced using a procedure is a good idea or really moves things forward in that direction. As you already indicated, there are a number of different libraries to translate views, but of the vario

Re: [DISCUSS] Remove iceberg-pig module ?

2024-10-17 Thread Daniel Weeks
+1 for deprecating and dropping On Thu, Oct 17, 2024 at 7:46 AM Eduard Tudenhöfner wrote: > +1 for marking the project deprecated (in 1.7.0) and dropping it in the > next release (1.8.0) > > On Thu, Oct 17, 2024 at 4:36 PM Russell Spitzer > wrote: > >> +1 (oink) >> >> If anyone really cares ple

Re: [DISCUSS] Discrepancy Between Iceberg Spec and Java Implementation for Snapshot summary's 'operation' key

2024-10-17 Thread Daniel Weeks
I'm not convinced this is incorrect behavior (table spec or implementation), but it does lend to some confusion. The 'summary' field is optional, which means that if a summary is not provided, you do not have an associated 'operation' field. The 'operation' field is only required in the context o

Re: Spec changes for deletion vectors

2024-10-16 Thread Daniel Weeks
Hey Everyone, I feel like at this point we've articulated all of the various options and paths forward, but this really just comes down to a matter of whether we want to make a concession here for the purpose of compatibility. If we were building this with no prior art, I would expect to omit the

Re: [VOTE] Standardize vended credentials in OpenAPI spec

2024-10-15 Thread Daniel Weeks
+1 On Tue, Oct 15, 2024 at 10:42 AM Russell Spitzer wrote: > +1 > > On Tue, Oct 15, 2024 at 12:28 PM Bryan Keller wrote: > >> +1 >> >> On Oct 15, 2024, at 10:14 AM, Eduard Tudenhöfner < >> etudenhoef...@apache.org> wrote: >> >> Hey everyone, >> >> I'd like to vote on #10722

Re: Iceberg View Spec Improvements

2024-10-10 Thread Daniel Weeks
the engine? Doesn't this mean the client will have to >> tell the Catalog what its local name is? >> >> On Thu, Oct 10, 2024 at 5:34 PM Daniel Weeks wrote: >> >>> Hey Walaa, >>> >>> I recognize the issue you're calling out but disagree ther

Re: Iceberg View Spec Improvements

2024-10-10 Thread Daniel Weeks
Hey Walaa, I recognize the issue you're calling out but disagree there is an implicit assumption in the spec. The spec clearly says how identifiers including catalogs and namespaces are represented/stored and how references need to be resolved. The idea that a catalog may not match is an environ

Re: [VOTE] Table V3 Spec: Row Lineage

2024-10-10 Thread Daniel Weeks
+1 Thanks Russell! On Thu, Oct 10, 2024 at 6:57 AM Eduard Tudenhöfner wrote: > I left a few comments on the proposal but I'm overall +1 on the proposal > > On Thu, Oct 10, 2024 at 12:08 PM Jean-Baptiste Onofré > wrote: > >> +1 >> >> I did a review on the proposal and it looks good to me. >> >>

Re: [PROPOSAL] Partially Loading Metadata - LoadTable V2

2024-10-10 Thread Daniel Weeks
Hey Haizhou, I think you've done a great job of capturing some of the metadata size related issues in the doc, but I would echo Eduard's comments that we should explore using the existing refs only loading first. This may require adding similar functionality for schemas/logs if we think that is a

Re: Iceberg View Spec Improvements

2024-10-10 Thread Daniel Weeks
Walaa, I just want to expand upon what Ryan said a little. The catalog naming issue was identified when we designed the view spec and we opted for simplicity as opposed to trying to solve for catalog name mapping as it really complicates the spec/implementation. There may be ways for implementat

Re: [Discuss] Iceberg community maintaining the docker images

2024-10-10 Thread Daniel Weeks
I think we should focus on the docker image for the test REST Catalog implementation. This is somewhat different from the TCK since it's used by the python/rust/go projects for testing the client side of the REST specification. As for the quickstart/example type images, I'm open to discussing wha

Re: [VOTE] Table v3 spec: Add unknown and new type promotion

2024-09-30 Thread Daniel Weeks
+1 (binding) On Fri, Sep 27, 2024 at 2:41 PM Russell Spitzer wrote: > +1 (binding) > > On Fri, Sep 27, 2024 at 4:37 PM rdb...@gmail.com wrote: > >> Hi everyone, >> >> I'd like to vote on PR #10955 >> that has been open for a >> while with the chang

Re: [VOTE] Drop Python3.8 Support in PyIceberg 0.8.0

2024-09-23 Thread Daniel Weeks
+1 On Mon, Sep 23, 2024 at 1:48 PM Ameena Ansari wrote: > +1 > > On Mon, 23 Sept 2024 at 13:33, rdb...@gmail.com wrote: > >> +1 >> >> On Mon, Sep 23, 2024 at 10:31 AM Steven Wu wrote: >> >>> +1 (binding). makes sense. >>> >>> On Mon, Sep 23, 2024 at 9:38 AM Yufei Gu wrote: >>> +1 Thanks

Re: [VOTE] Merge REST Spec Change To Add New Scan Planning APIs

2024-09-05 Thread Daniel Weeks
+1 (binding) On Tue, Sep 3, 2024 at 1:22 PM rdb...@gmail.com wrote: > +1 > > I think it would be good to give an overview of the current proposal since > it has evolved quite a bit from the original like Jack said. > > On Tue, Sep 3, 2024 at 9:09 AM Jack Ye wrote: > >> Thanks for keeping pushin

Re: [DISCUSS] Iceberg Materialzied Views

2024-09-03 Thread Daniel Weeks
I'm generally in favor of approach #1 with UUID in lineage. I think it's helpful to know if the underlying table changes (e.g. identifier remains the same, but the table was changed). However, I'm not sure what the behavior would be in that case. Any refresh at that point would not be able to pr

Re: [VOTE] Merge REST Spec change to add RemovePartitionSpecsUpdate update type

2024-08-29 Thread Daniel Weeks
+1 (binding) On Wed, Aug 28, 2024 at 8:33 AM Jack Ye wrote: > +1 (binding) > > On Tue, Aug 27, 2024 at 5:21 AM roryqi wrote: > >> +1 >> >> On Tue, Aug 27, 2024 at 11:44 Manu Zhang wrote: >> >>> +1 (non-binding) >>> >>> On Tue, Aug 27, 2024 at 11:00 AM xianjin wrote: >>> +1 (non-binding) Sent from

Re: Type promotion in v3

2024-08-20 Thread Daniel Weeks
I'm pretty strongly opposed to the idea of assigning new field ids as part of type promotion. I understand what we're trying to accomplish, but I just don't think that's the right mechanism to achieve it. The field id specifically identifies the field and shouldn't change as attributes change (na

Re: [VOTE] REST Endpoint discovery

2024-08-20 Thread Daniel Weeks
+1 On Tue, Aug 20, 2024 at 11:19 AM Yufei Gu wrote: > +1 > > Yufei > > > On Tue, Aug 20, 2024 at 11:16 AM Eduard Tudenhöfner < > etudenhoef...@apache.org> wrote: > >> Hey everyone, >> >> I'd like to vote on PR #10928 >> which adds a way for REST >>

Re: [VOTE] Spec changes in preparation for v3

2024-08-19 Thread Daniel Weeks
+1 (binding) On Mon, Aug 19, 2024, 4:11 PM Steven Wu wrote: > +1 (binding) > > On Mon, Aug 19, 2024 at 4:06 PM Anton Okolnychyi > wrote: > >> +1 (binding) >> >> - Anton >> >> On Mon, Aug 19, 2024 at 13:49 John Zhuge wrote: >> >>> +1 (non-binding) >>> >>> On Mon, Aug 19, 2024 at 1:34 PM Yufei Gu

Re: [VOTE] Release Apache PyIceberg 0.7.1rc2

2024-08-16 Thread Daniel Weeks
nstallation > requirements as well. > > Here's a PR with an attempt to fix this issue, and the issue with > docutils (0.21.post1): https://github.com/apache/iceberg-python/pull/1067 > > On Thu, Aug 15, 2024 at 5:35 PM Daniel Weeks wrote: > >> I ran into a coup

Re: [VOTE] Release Apache PyIceberg 0.7.1rc2

2024-08-15 Thread Daniel Weeks
I ran into a couple issues while trying to verify the release. The first appears to be a transient issue (we ran into something similar in the 0.6.1 release but I was able to install later). Package docutils (0.21.post1) not found. make: *** [install-dependencies] Error 1 The second issue is mor

Re: [DISCUSS] Variant Spec Location

2024-08-15 Thread Daniel Weeks
>>> On Thu, Aug 15, 2024 at 10:07 AM Jack Ye wrote: >>> >>> +1 for copying the spec into our repository, I think we need to own it >>> fully as a part of the table spec, and we can build compatibility through >>> tests. >>> >>> -Jack

Re: [DISCUSS] Variant Spec Location

2024-08-14 Thread Daniel Weeks
I'm really excited about the introduction of variant type to Iceberg, but I want to raise concerns about forking the spec. I feel like preemptively forking would create the situation where we end up diverging because there's little reason to work with both communities to evolve in a way that benef

Re: [VOTE] Merge REST spec clarification on how servers should handle unknown updates/requirements

2024-08-14 Thread Daniel Weeks
+1 On Tue, Aug 13, 2024 at 7:34 PM xianjin wrote: > +1 > > On Aug 14, 2024, at 2:24 AM, Ryan Blue > wrote: > > +1 > > On Tue, Aug 13, 2024 at 8:59 AM Yufei Gu wrote: > >> +1 >> Yufei >> >> >> On Tue, Aug 13, 2024 at 8:57 AM Eduard Tudenhöfner < >> etudenhoef...@apache.org> wrote: >> >>> +1

Re: [DISCUSS] Changing namespace separator in REST spec

2024-08-05 Thread Daniel Weeks
I would agree with adding either a server side (config override) or client side control (query param with `?delim=.`) as it will be compatible with the current v1 endpoint. In the future we could introduce a v2 endpoint(s), but I would want to wait for OpenAPI 4 because they address this by allowi

Re: [DISCUSS] Clarify in REST spec expected implementation behavior for unknown updates or requirements

2024-08-05 Thread Daniel Weeks
I feel like this is a little bit of a gray area in terms of 400 vs 422. While I agree that 422 reads like the right answer just based on the definition of the codes, I think that it will be hard to implement and may not make sense in context of how the server evolves. If a server has not implement

Re: [VOTE] Merge specification clarifications on reading/writing partition values

2024-08-05 Thread Daniel Weeks
+1 (binding) On Fri, Aug 2, 2024 at 1:25 PM Ryan Blue wrote: > +1 (binding) > > On Fri, Aug 2, 2024 at 12:03 PM Yufei Gu wrote: > >> +1 (binding) >> Yufei >> >> >> On Fri, Aug 2, 2024 at 11:18 AM Prashant Singh >> wrote: >> >>> +1 (non-binding) >>> Thanks Micah ! >>> >>> Regards, >>> Prashant

Re: [VOTE] Clarify "File System Tables" in the table spec

2024-08-01 Thread Daniel Weeks
Added comments to the PR to include a target removal version and appropriate alternative messaging. +1 (binding) On Thu, Aug 1, 2024 at 8:24 AM Jack Ye wrote: > +1 (binding) > > -Jack > > On Thu, Aug 1, 2024 at 6:30 AM Russell Spitzer > wrote: > >> +1 (Binding) >> >> On Thu, Aug 1, 2024 at 7:3

Re: [VOTE] Drop Java 8 support in Iceberg 1.7.0

2024-07-31 Thread Daniel Weeks
+1 (binding) On Sun, Jul 28, 2024 at 11:35 PM Huang-Hsiang Cheng wrote: > +1 (non-binding) > > Thanks, > Huang-Hsiang > > On Jul 27, 2024, at 12:42 AM, Steve Zhang > wrote: > > +1 (non-binding) > > Thanks, > Steve Zhang > > > > On Jul 26, 2024, at 9:15 AM, Amogh Jahagirdar <2am...@gmail.com> wr

Re: [ANNOUNCE] Welcoming new committers and PMC members

2024-07-23 Thread Daniel Weeks
Congrats everyone! Thanks for all the great work and looking forward to more. -Dan On Tue, Jul 23, 2024 at 9:20 AM Anton Okolnychyi wrote: > Congrats everyone! > > Tue, Jul 23, 2024 at 09:11 Yufei Gu wrote: >> Congratulations! Thanks a lot for the contribution! >> >> Yufei >> >> >> On Tue,

Re: Building with JDK 21

2024-07-19 Thread Daniel Weeks
I'm also in favor of removing Java 8 support. Hive docs state Hive 3 requires Java 8, and in prior cases there were potential correctness issues when running with newer Java versions (these may have been addressed). As long as we're not upda

Re: [VOTE] Release Apache Iceberg 1.6.0 RC1

2024-07-19 Thread Daniel Weeks
+1 (binding) Verified sigs/sums/license/build/test (Java 17) -Dan On Fri, Jul 19, 2024 at 2:53 AM Robert Stupp wrote: > +1 (nb) > > On 18.07.24 08:37, Jean-Baptiste Onofré wrote: > > Hi everyone, > > > > I propose that we release the following RC as the official Apache > > Iceberg 1.6.0 releas

Re: [VOTE] Merge table spec clarifications on time travel and equality deletes

2024-07-19 Thread Daniel Weeks
+1 (binding) Thanks, Micah. On Thu, Jul 18, 2024 at 8:29 PM Amogh Jahagirdar <2am...@gmail.com> wrote: > +1 (non-binding) on these spec clarifications > > Thanks, > Amogh Jahagirdar > > On Thu, Jul 18, 2024 at 5:08 PM Steven Wu wrote: > >> I am +1 for the spec clarifications. >> >> I have left

Re: [DISCUSS] Deprecate HadoopTableOperations, move to tests in 2.0

2024-07-17 Thread Daniel Weeks
In Java, I think you're looking for the 'Tables' interface and 'HadoopTables' implementation for just directly loading a table from a location. On Wed, Jul 17, 2024 at 8:48 PM Renjie Liu wrote: > I think there are two ways to do this: > 1. As Xuanwo said, we refactor HadoopCatalog to be read on
