Re: [DISCUSS] Iceberg Community Guidelines

2023-03-17 Thread russell . spitzer
Looks good to me, although for recruiting I think maybe we should have a dedicated jobs channel? Sent from my iPhone On Mar 17, 2023, at 12:46 PM, Daniel Weeks wrote: Hey everyone, With the increasing level of activity in the Iceberg channels, I feel now is a good opportunity to establish some guide

Re: [VOTE] Release Apache Iceberg 1.2.0 RC1

2023-03-20 Thread Russell Spitzer
+1 (binding) - Validated sig, license, checksum, rat - ran full test suite - Basic testing on Spark 3.3 > On Mar 19, 2023, at 11:20 PM, Manu Zhang wrote: > > +1 (non-binding) > > Build from source and run quick start queries > using

[Discuss] Allow all users who have Committed to the project to run CI without Approval

2023-03-29 Thread Russell Spitzer
Recent moves by Apache Infra have changed the policy on github actions from "Only requires approval first time" to "Requires approval every time". I think this is a big step backwards in terms of getting folks involved in the project and in terms of the amount of committer busy work required to

Re: Welcome new PMC members!

2023-04-11 Thread Russell Spitzer
Great news, Congratulations to all! > On Apr 11, 2023, at 5:11 PM, Dmitri Bourlatchkov > wrote: > > Congratulations Fokko, Steven, and Yufei! > > On Tue, Apr 11, 2023 at 5:22 PM Ryan Blue > wrote: > Hi everyone! > > I want to congratulate 3 new PMC members, Fokko Drie

Re: [DISCUSS] Dropping Spark 2.4 support

2023-04-14 Thread Russell Spitzer
+1, Spark 2.4 is very out of sync with current developments as noted above. It's almost impossible for us to get any newer features to be compatible with it. On Fri, Apr 14, 2023 at 4:52 PM Anjali Norwood wrote: > Hi Fokko, Ryan, > > Netflix is still on Spark-2.4.4 with Iceberg-0.9. We are > act

Re: [DISCUSS] Spark 3.1 support?

2023-04-22 Thread russell . spitzer
If you are on a forked 0.13, is it important to keep these changes in master? Sent from my iPhone On Apr 22, 2023, at 8:42 PM, Manu Zhang wrote: I'd like to share our maintenance strategy and history at eBay. We are now on forked versions of Iceberg 0.13.1 and Spark 3.1.1. For Spark, we started to evalu

Re: [DISCUSS] Spark 3.1 support?

2023-04-22 Thread russell . spitzer
If you are already back-porting patches to a different branch which isn’t receiving other fixes anyway, why does it help to keep 3.1 support in master? If we kept it there, we would be doing so for users who want a 1.2+ Iceberg version with 3.1 support. That doesn’t sound like your use case. I’m no

Re: Why is sort required for Spark writing to partitioned table

2023-04-25 Thread Russell Spitzer
s in SPARK-23889 >> <https://issues.apache.org/jira/browse/SPARK-23889> are resolved in >> spark-3.2.0. Does that mean we don't need explicit sort anymore from >> spark-3.2.0 and after? >> >> Thanks >> >> On Tue, Mar 7, 2023 at 8:10 PM Russell

Re: Welcome new committers and PMC!

2023-05-03 Thread Russell Spitzer
Great news! It's so exciting to have the project continue to grow! > On May 3, 2023, at 2:06 PM, Ryan Blue wrote: > > Hi everyone, > > I want to congratulate Amogh and Eduard, who were just added as Iceberg > committers and Szehon, who was just added to the PMC. Thanks for all your > contribu

Re: Support create table like for Iceberg table?

2023-05-09 Thread Russell Spitzer
How would Create Table Like be different from our "Snapshot" procedure, just enabled for Iceberg tables? Wondering if we should just expand that functionality. On Tue, May 9, 2023 at 11:54 AM Pucheng Yang wrote: > Ryan, when I mentioned "copy of the data", I didn't mean to > physically copy th

Re: Slack invitation

2023-05-10 Thread Russell Spitzer
Does this link no longer work? https://iceberg.apache.org/community/#slack > On May 10, 2023, at 12:58 AM, Thijs van de Poll > wrote: > > Hi, > > I would like to get access to the Apache Iceberg community on Slack if that > would be possible. Would you mind sending an invitation to this ema

Re: Slack invitation

2023-05-10 Thread Russell Spitzer
t have an @apache.org <http://apache.org/> email address? > Contact the workspace administrator at apache-iceberg for an invitation. > > On Wed, May 10, 2023 at 9:13 PM Russell Spitzer <mailto:russell.spit...@gmail.com>> wrote: > Does this link no longer work? > > ht

Re: Slack invitation

2023-05-10 Thread Russell Spitzer
: > > I tried that its not working for me, I updated below ss to check as i am not > doing anything wrong > > > when i tried to login with google, it says > > > On Wed, May 10, 2023 at 9:50 PM Russell Spitzer <mailto:russell.spit...@gmail.com>

Re: Scan statistics

2023-05-15 Thread Russell Spitzer
I think currently the recommendation would be to filter the iterator rather than pulling the whole object with stats into memory. Is there a requirement that all of the DataFiles be pulled into memory before filtering? On Mon, May 15, 2023 at 9:49 AM Péter Váry wrote: > Hi Team, > > We have a F
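As a rough illustration of the suggestion above (filter lazily rather than materializing every entry first), here is a minimal Python sketch. It does not use the Iceberg API: `FileEntry`, `plan_files`, and `filter_entries` are hypothetical names standing in for a scan's file iterator, and the record-count predicate is only an assumed example of a filter.

```python
from typing import Iterable, Iterator, NamedTuple


class FileEntry(NamedTuple):
    """Hypothetical stand-in for a data file entry that carries stats."""
    path: str
    record_count: int


def plan_files() -> Iterator[FileEntry]:
    """Yield entries one at a time instead of building a full list."""
    for i in range(1_000):
        yield FileEntry(f"data/file-{i}.parquet", i)


def filter_entries(entries: Iterable[FileEntry], min_records: int) -> Iterator[FileEntry]:
    """Lazily keep only the entries that pass the predicate."""
    return (e for e in entries if e.record_count >= min_records)


# Only the entries that pass the filter are ever held in memory together.
selected = list(filter_entries(plan_files(), min_records=998))
print([e.path for e in selected])
```

Because both functions are generators, each entry can be discarded as soon as it fails the predicate, which is the point of filtering the iterator.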

Re: Scan statistics

2023-05-22 Thread Russell Spitzer
gt;>>> When the Flink job was able to access the Catalog again then it fetched >>>> all the data from the table which arrived during the downtime. Since the >>>> planning does not guarantee the order of the Tasks we ended up out of order >>>> records whic

Re: Copyonwrite scan

2023-05-24 Thread russell . spitzer
Could you include the exception you are seeing? Sent from my iPhone > On May 23, 2023, at 9:13 PM, Gaurav Agarwal wrote: > >  > Hi > > We are getting > " runtime file filtering exception the table has been concurrently modified > row level operation scan snapshot id " > > This exception we

Re: Iceberg transaction support with spark sql

2023-05-25 Thread Russell Spitzer
We also have branch and merge support, which may be a bit easier to use > On May 25, 2023, at 4:33 PM, Anton Okolnychyi > wrote: > > Unfortunately, Spark SQL does not have an API for transactions. However, you > may use Iceberg WAP to stage multiple changes and commit them as a single > Icebe

Re: 👋 Intro and question for the community

2023-05-30 Thread Russell Spitzer
Could you please elaborate on what Common room really is and why it needs special permissions? I would have thought just generic public access would be enough to check PRs, issues and such? On Tue, May 23, 2023 at 7:17 PM Anton Okolnychyi wrote: > Seems valuable to me. > > - Anton > > On May

Re: 👋 Intro and question for the community

2023-05-30 Thread Russell Spitzer
munities or I'd be happy to bring > anyone who is interested on to a phone call with Common Room to ask any > other questions. > > Let me know what you think. > > > > On Tue, May 30, 2023 at 10:54 AM Russell Spitzer < > russell.spit...@gmail.com> wrote: > >>

Re: How to use Apache Iceberg cli

2023-06-01 Thread russell . spitzer
Iceberg does not have its own CLI or a web UI. The spark-shell (Spark CLI) or Trino are usually recommended for early testing. The instructions for interacting are in the other Spark doc pages where you found the getting started docs. Sent from my iPhone > On Jun 1, 2023, at 9:40 AM, Arvind Di

Re: Iceberg old partition gc

2023-06-02 Thread Russell Spitzer
I think "soft-mode" is really just doing the delete. You can then recover the snapshot if you happen to have accidentally TTL'd a partition. On Fri, Jun 2, 2023 at 8:51 AM Szehon Ho wrote: > I think this violates Iceberg’s assumption of immutable snapshots. That > would require modifying the ol

Re: How to remove an Iceberg partition that only contains parquet files with 0 record

2023-06-30 Thread russell . spitzer
You probably will need to manually delete the file entry using the table API from Java. Sent from my iPhone On Jun 30, 2023, at 6:58 AM, Pucheng Yang wrote: Hi Manu, the table has already been migrated to Iceberg and I think your command only available to Hive table. It seems won’t help my case. Appr

Re: Ad-hoc partition bucketing

2023-07-05 Thread russell . spitzer
We have been discussing something like this as well, either an arbitrary partitioning scheme or just a more extensive and customizable transform. An example I’m interested in is a geo hash index where we store offsets on a large grid to denote partitions. The total offset file for the whole plan

Re: iceberg and s3a compatibility

2023-07-11 Thread russell . spitzer
The long story short is that Iceberg itself is a commit protocol. So you don’t have to configure any Hadoop commit protocols. Iceberg doesn’t use those methods because its metadata structure doesn’t rely on the location of data files as information about the state of those files. It can just wri

Re: Iceberg docs pull requests

2023-07-14 Thread Russell Spitzer
One of the issues is we kind of have a dual repo doc process. Most doc changes that are versioned are made in the main oss apache repo and they are copied over when Iceberg is released. So changes against the doc repo are only for fixing past docs or non-versioned pages. I'll take a quick look at

Re: Slack invitation link

2023-07-17 Thread Russell Spitzer
You shouldn't need one, does this link work? They only allow a certain number of joins, once they hit that we have to add a new link. But they don't tell us when the limit is hit On Mon, Jul 17, 2023 at 1:50 PM Jacob Marble wrote: > Good morning, > > I'm looking for an invitation to join the Apa

Re: Slack invitation link

2023-07-17 Thread Russell Spitzer
https://iceberg.apache.org/community/#slack Sorry Forgot the link On Mon, Jul 17, 2023 at 2:05 PM Russell Spitzer wrote: > You shouldn't need one, does this link work? > They only allow a certain number of joins, once they hit that we have to > add a new link. But they don'

Re: [PROPOSAL] Preparing first Apache Iceberg Summit

2023-07-19 Thread Russell Spitzer
I would love to be involved if possible. I'm a bit short on time though but can definitely contribute async time to planning. On Wed, Jul 19, 2023 at 9:35 AM Jean-Baptiste Onofré wrote: > Hi guys, > > Following the previous email about Apache Iceberg Summit, please find > a document introducing

Re: Location of rust repo

2023-07-19 Thread Russell Spitzer
+1, If the folks working on Rust want it in the main repo I have no issues with that but it should be their choice :) On Wed, Jul 19, 2023 at 12:47 PM Ryan Blue wrote: > I don't have a strong opinion here. I'd probably lean toward having it in > the main repo to get more eyes on the PRs, but I t

Re: Broken slack invite

2023-07-24 Thread Russell Spitzer
https://github.com/apache/iceberg-docs/commit/a42abbf9e7cda62ac4d94943599e840d4342d6c5 It was just updated but I don't think the docs have been republished yet > On Jul 23, 2023, at 7:34 AM, Bruno Murino w

Re: [PROPOSAL] Expose Apache Iceberg Slack data for hacking/education for the community

2023-08-08 Thread Russell Spitzer
I'm +1 as long as Slack TOS are ok with it. We already have full public archives of the mailing list and I see slack as just an extension of the mailing list. On Tue, Aug 8, 2023 at 4:18 PM Brian Olsen wrote: > Hey Iceberg Nation, > > I wanted to propose having the public Apache Iceberg Slack >

Re: Behavior of dropping table with HadoopCatalog

2023-08-30 Thread russell . spitzer
There is no way to drop a Hadoop catalog table without removing the directory so I’m not sure what the alternative would be. Sent from my iPhone On Aug 29, 2023, at 10:10 PM, Manu Zhang wrote: Hi all, The current behavior of dropping a table with HadoopCatalog looks inconsistent to me. When stored at

Re: Proposal: Introduce deletion vector file to reduce write amplification

2023-10-09 Thread Russell Spitzer
The main things I’m still interested in are alternative approaches. I think that some of the work Anton is doing has shown some different bottlenecks in applying delete files that I’m not sure are addressed by this proposal. For example, this proposal suggests doing a 1 to 1 (or 1 rowgroup t

Re: [PROPOSAL] Use Microsoft Style Guide for documentation

2023-11-02 Thread Russell Spitzer
+1 > On Nov 1, 2023, at 6:13 PM, Yufei Gu wrote: > > +1 Love the following example. Not sure if Vale can catch this and provide > suggestions. It may be only possible with LLM. >> Replace this: If you're ready to purchase Office 365 for your organization, >> contact your Microsoft account repr

Re: [Discussion] Move `iceberg-parquet` and `iceberg-orc` modules into `iceberg-core`

2023-11-02 Thread Russell Spitzer
Is there an alternative where we do an implementation similar to how Position Deletes and Data Files are currently written? Like we have the more generic "writers" in core but the actual implementations still live in iceberg-parquet or iceberg-orc? > On Nov 2, 2023, at 9:38 AM, Ajantha Bhat wr

Re: [VOTE] Release Apache Iceberg 1.4.2 RC0

2023-11-02 Thread Russell Spitzer
+1 - Checked all the normal things (Checksum, Tests, Rat) > On Nov 2, 2023, at 12:38 PM, Ryan Blue wrote: > > +1 > > Thanks for getting this fix out, Amogh! > > On Thu, Nov 2, 2023 at 9:19 AM Amogh Jahagirdar > wrote: >> Thanks Ajantha, I've reached out to a few more

Re: Add me to slack channel

2023-11-07 Thread Russell Spitzer
There is a self invite link https://iceberg.apache.org/community/#slack < Here but if it is no longer working let us know, we have to renew it every few hundred invitees On Sun, Nov 5, 2023 at 1:02 PM Sardar Khan wrote: > Hi, > I have a few questions regards to the new deltalakemigration method

Re: Is there a way to distcp iceberg table from hadoop?

2023-12-04 Thread Russell Spitzer
Delta now exposes this functionality as a command, and some groups (like ours) have some internal functionality for doing this. I think it's worth reconsidering this as a first class procedure in the Iceberg-Spark module since we get a lot of requests about it and now position deletes are a bit mor

Re: Iceberg Logo Fix and Iceberg Swag Shop

2023-12-06 Thread Russell Spitzer
The original email has a broken png link so I was never able to see the issue, could you attach the before and after so I can see the difference? > On Dec 6, 2023, at 9:07 AM, Brian Olsen wrote: > > Hey all, > > I wanted to resurface this and see if any PMC could take a look. Thanks! > > On W

Re: Iceberg Logo Fix and Iceberg Swag Shop

2023-12-06 Thread Russell Spitzer
the only public images I know of. Let me know if there are any > issues viewing them. > > On Wed, Dec 6, 2023 at 9:53 AM Weston Pace <mailto:weston.p...@gmail.com>> wrote: >> BTW: ASF mailing lists strip attachments and so you will need to use a gist >> or some o

Re: [DISCUSS] Run GC with Catalog or Tables

2023-12-06 Thread Russell Spitzer
I just think this is a bit more complicated than I want to take into the main library just because we have to make decisions about 1. Retries 2. Concurrency 3. Results/Error Reporting But if we have a good proposal for how we will handle all those I think we could do it? > On Dec 6, 2023, at 2:05

Re: [PROPOSAL] Improvement on our PR flows

2024-01-03 Thread Russell Spitzer
I definitely need something to keep emailing me, so I support this. On Wed, Jan 3, 2024 at 7:52 AM Jean-Baptiste Onofré wrote: > Hi guys, > > We have several examples where we have some kind of "stale" PRs, > either because we are waiting for a review, or we are waiting for > changes from the c

Re: [DISCUSS] Iceberg community summit

2024-01-12 Thread Russell Spitzer
I'd also like to volunteer > On Jan 12, 2024, at 12:27 PM, Brian Olsen wrote: > > Hey Iceberg nation, > > I would like to volunteer to be on the selection committee. I have a lot of > experience from my time working on the Trino Community. I helped run the > Trino Summits in 2021 and 2022

Re: Partition column order in rewrite manifests

2024-01-30 Thread russell . spitzer
Sounds like a reasonable thing to add? Maybe we could check cardinality to pick out the default order as well? Sent from my iPhone On Jan 30, 2024, at 3:50 PM, Jack Ye wrote: Hi everyone, Today, the rewrite manifest procedure always orders the data files based on their data_file.partition value. Spec

Re: New committer: Bryan Keller

2024-03-05 Thread Russell Spitzer
Congratulations! > On Mar 5, 2024, at 10:04 AM, Chris Ward wrote: > > Congrats Bryan! > > From: Steve Zhang > Date: Tuesday, March 5, 2024 at 9:59 AM > To: dev@iceberg.apache.org > Subject: Re: New committer: Bryan Keller > > Congrats Bryan, well deserved! > > Thanks, > Steve Zhang > >

Re: [VOTE] Release Apache Iceberg 1.5.0 RC6

2024-03-11 Thread Russell Spitzer
+1 (binding) Verified all the usual things Ran full test suite Everything looking good > On Mar 8, 2024, at 11:35 PM, Szehon Ho wrote: > > +1 (binding) > > * Verified signature > * Verified checksum > * RAT check > * built JDK 11 > * Ran basic tests on Spark 3.5 > > Thanks > Szehon > > On Fr

Re: [Proposal] Add support for Materialized Views in Iceberg

2024-04-25 Thread Russell Spitzer
+1 to separate. > On Apr 25, 2024, at 2:08 PM, Jean-Baptiste Onofré wrote: > > +1 to separate, it makes sense to me. > > Regards > JB > > On Thu, Apr 18, 2024 at 11:50 AM Walaa Eldin Moustafa > wrote: >> >> Hi everyone, >> >> I would like to make a proposal for issue [1] to support material

Re: [VOTE] Release Apache Iceberg 1.5.2 RC0

2024-05-06 Thread Russell Spitzer
+1 (binding) Checked all the normal things Tests Rat Checksum Internal Tests > On May 6, 2024, at 5:35 AM, Cheng Pan wrote: > > +1 (non-binding) > > Integrated with Apache Kyuubi[1], the CI covers > > - Iceberg Spark 3.3-3.5 with Scala 2.12 > - Iceberg Spark 3.5 with Scala 2.13 > > [1] https

Re: [Proposal] Add support for Flink Maintenance in Iceberg

2024-05-06 Thread Russell Spitzer
+1 I'm mostly in favor of the single pipeline model but I don't see any issue with supporting both models. > On May 6, 2024, at 1:43 PM, Rodrigo Meneses wrote: > > +1 > Thanks so much for driving this Peter! > > On Fri, May 3, 2024 at 11:30 AM Péter Váry >

Re: Summary of Iceberg Materialized View Meeting

2024-06-06 Thread russell . spitzer
Thanks for hosting. It was a very helpful meeting. I really hope we can do more in the future to accelerate consensus on other proposals. I do encourage anyone on the mailing list to add your comments offline as well, especially if you have strong feelings. Iceberg is an open project and we realize

Re: Feedback Collection: Bylaws in Iceberg

2024-06-25 Thread Russell Spitzer
Thanks for bringing this up Jack. I think having more established rules specifically for the project is probably a good thing to make sure outsiders see a path to becoming more included in the project. I'm especially interested in the proposals for more actively including newer contributors from d

Re: Building with JDK 21

2024-07-09 Thread Russell Spitzer
The different formatting preferences sound annoying enough that I would think we should just drop the Java 8 support. Do we have anyone who strongly prefers keeping Java 8 support? As an alternative I think it would be fine if we disable the formatter when using Java 21 and just make sure we alway

Re: [DISCUSS] Formalized File IO Properties

2024-07-10 Thread Russell Spitzer
Sounds reasonable to me On Wed, Jul 10, 2024 at 9:28 AM Renjie Liu wrote: > Hi: > > +1 for standardizing iceberg properties. This will help to align different > language implementations. > > On Wed, Jul 10, 2024 at 9:44 PM wrote: > >> Hello Everyone, >> >> I was considering discussing the stand

Re: [VOTE] Fix property names in REST spec for statistics / partition statistics

2024-07-10 Thread Russell Spitzer
+1 On Wed, Jul 10, 2024 at 9:47 AM Amogh Jahagirdar <2am...@gmail.com> wrote: > +1 (non-binding) > > On Wed, Jul 10, 2024 at 7:16 AM Piotr Findeisen > wrote: > >> +1 (non binding) >> >> On Wed, 10 Jul 2024 at 10:11, Jean-Baptiste Onofré >> wrote: >> >>> +1 (non binding) >>> >>> Regards >>> JB >

Re: [Vote] Deprecate oauth tokens endpoint

2024-07-10 Thread Russell Spitzer
+1 On Wed, Jul 10, 2024 at 11:03 AM Russell Spitzer wrote: > +` > > On Wed, Jul 10, 2024 at 9:33 AM Renjie Liu > wrote: > >> +1 (non binding) >> >> On Wed, Jul 10, 2024 at 4:35 PM roryqi wrote: >> >>> +1. >>> >>> Driesprong, F

Re: [Vote] Deprecate oauth tokens endpoint

2024-07-10 Thread Russell Spitzer
+` On Wed, Jul 10, 2024 at 9:33 AM Renjie Liu wrote: > +1 (non binding) > > On Wed, Jul 10, 2024 at 4:35 PM roryqi wrote: > >> +1. >> >> On Wed, Jul 10, 2024 at 16:29, Driesprong, Fokko wrote: >> >>> +1 (binding) >>> >>> On Wed, Jul 10, 2024 at 10:14, Jean-Baptiste Onofré wrote: >>> +1 (non binding)

Re: [DISCUSS] Enable the discussion tab for iceberg github repos

2024-07-10 Thread Russell Spitzer
I'm a fan of having more things on github if possible. I haven't used this feature but it sounds like it could be useful. On Wed, Jul 10, 2024 at 6:15 AM Renjie Liu wrote: > I’m fine with enabling it in iceberg-rust first and see how it goes. > > On Wed, Jul 10, 2024 at 17:39 Fokko Driesprong w

Re: [Early Feedback] Variant and Subcolumnarization Support

2024-07-12 Thread Russell Spitzer
Feels like eventually the encoding should land in parquet proper right? I'm fine with us just copying into Iceberg though for the time being. On Fri, Jul 12, 2024 at 2:31 PM Ryan Blue wrote: > Oops, it looks like I missed where Aihua brought this up in his last email: > > > do we have an issue t

Re: [Early Feedback] Variant and Subcolumnarization Support

2024-07-12 Thread Russell Spitzer
ke a standalone module from it? > > On Fri, Jul 12, 2024 at 12:38 PM Russell Spitzer < > russell.spit...@gmail.com> wrote: > >> Feels like eventually the encoding should land in parquet proper right? >> I'm fine with us just copying into Iceberg though for the time

Re: [Early Feedback] Variant and Subcolumnarization Support

2024-07-12 Thread Russell Spitzer
s a priority. > > Should we move forward by starting a draft of the changes to the table > spec? Then we can vote on committing those changes and get moving on an > implementation (or possibly do the implementation in parallel). > > On Fri, Jul 12, 2024 at 1:08 PM Russell Spitzer >

Re: [VOTE] Release Apache Iceberg 1.6.0 RC0

2024-07-12 Thread Russell Spitzer
+1 - Checked all the normal things (Rat, Tests, Build, Spark) On Fri, Jul 12, 2024 at 1:14 PM Dmitri Bourlatchkov wrote: > +1 (nb) > > I verified OAuth2 in the REST Catalog with Spark / Keycloak (client > secret) / Nessie. > > The token URI warning is prominently displayed, when `oauth2-server-ur

Re: [Early Feedback] Variant and Subcolumnarization Support

2024-07-12 Thread Russell Spitzer
r didn't seem to object > to the encoding from what I read of his comments. Hopefully he (and others) > chime in here. > > On Fri, Jul 12, 2024 at 1:32 PM Russell Spitzer > wrote: > >> I just want to make sure we get Piotr and Peter on board as >> representatives

Re: [VOTE] Release Apache Iceberg 1.6.0 RC0

2024-07-17 Thread Russell Spitzer
I'm in for RC1, -1 Vote for RC0 On Wed, Jul 17, 2024 at 3:13 PM Jean-Baptiste Onofré wrote: > Hi Amogh > > Thanks ! Imho, I would prefer to change/"fix" the > TableMetadata.Builder constructor in 1.6.0. If we release like this, > It will be painful to deprecate and probably a bit confusing. > I

Re: [Early Feedback] Variant and Subcolumnarization Support

2024-07-18 Thread Russell Spitzer
the exact set of types supported is worthwhile (and if the > > > goal is to maintain the same set as specified by the Spark Variant > type or > > > if divergence is expected/allowed). From a fragmentation perspective > it > > > would be a shame if they diverge, so mayb

Re: [RESULT][VOTE] Merge table spec clarifications on time travel and equality deletes

2024-07-19 Thread Russell Spitzer
+1, Sorry I meant to vote before. I just had nits on the wording On Fri, Jul 19, 2024 at 2:04 PM Micah Kornfield wrote: > Hi Dmitri, > Thank you for the comment, maybe we can continue the discussion on the PR > (there are still some other open issues). I don't think the current spec > reference

Re: [ANNOUNCE] Welcoming new committers and PMC members

2024-07-23 Thread Russell Spitzer
This is truly an exciting day. To have to many qualified folks being recognized by the Iceberg project fills me with pride. I can't wait to see what we get done together! On Tue, Jul 23, 2024 at 9:12 AM Sung Yun wrote: > Thank you very much! > > I am excited to see the project growing to new cap

Re: [ANNOUNCE] Welcoming new committers and PMC members

2024-07-23 Thread Russell Spitzer
"so many" :) On Tue, Jul 23, 2024 at 9:14 AM Russell Spitzer wrote: > This is truly an exciting day. To have to many qualified folks being > recognized by the Iceberg project fills me with pride. I can't wait to see > what we get done together! > > On Tue, Ju

Re: [DISCUSS][BYLAWS] Moving forward on the bylaws

2024-07-23 Thread Russell Spitzer
Micah has a great list there for me. I'm similarly not as interested in the bureaucracy of the project and more interested in actually discussing how we operate from a technical perspective as the community grows. On Tue, Jul 23, 2024 at 1:01 AM Micah Kornfield wrote: > My 2 cents on this topic.

Re: Dropping JDK 8 support

2024-07-23 Thread Russell Spitzer
+1 On Tue, Jul 23, 2024 at 11:47 AM huaxin gao wrote: > We don't have to drop JDK11 support. In spark-ci, I can change the matrix > to only run Java 17 for Spark 4.0, but in java-ci, we might not be able to > build java docs and do build checks for JDK 11. > > Huaxin > > > > On Tue, Jul 23, 2024

Re: [DISCUSS] Spec clarifications on reading/writing Identity partitioned columns

2024-07-25 Thread Russell Spitzer
I have no problem with explicitly stating that writing identity source columns is optional on write. We should, of course, mandate surfacing the column on read :) On Thu, Jul 25, 2024 at 1:30 PM Micah Kornfield wrote: > The Table specification doesn't mention anything about requirements for > wh

Re: [VOTE] Drop Java 8 support in Iceberg 1.7.0

2024-07-26 Thread Russell Spitzer
+1 (bind) On Fri, Jul 26, 2024 at 8:34 AM Péter Váry wrote: > +1 (non-binding) > > Ajantha Bhat wrote (on Fri, Jul 26, 2024, > 14:51): > >> +1 >> >> On Fri, Jul 26, 2024 at 5:16 PM Eduard Tudenhöfner < >> etudenhoef...@apache.org> wrote: >> >>> +1 (non-binding) for dropping JDK8 suppor

Re: [DISCUSS] Deprecate HadoopTableOperations, move to tests in 2.0

2024-07-31 Thread Russell Spitzer
My guess would be to avoid complications with multiple committers attempting to swap at the same time. On Wed, Jul 31, 2024 at 9:50 AM Jack Ye wrote: > I see, thank you Fokko, this is a very helpful context. > > Looking at the discussion in the PR and discussions in it, it seems like > the versi

Re: [DISCUSS] adoption of format version 3

2024-07-31 Thread Russell Spitzer
Thanks for bringing this up. I would say that from my perspective I have time to really push through hopefully two things: Variant Type and Row Lineage (which I will have a proposal for on the mailing list next week). I'm using the Project to try to track logistics and minutia required for the new

Re: [DISCUSS] adoption of format version 3

2024-07-31 Thread Russell Spitzer
>> [1] https://iceberg.apache.org/spec/#default-values >> [2] https://github.com/apache/iceberg/pull/9502 >> >> Thanks, >> Walaa. >> >> On Wed, Jul 31, 2024 at 10:52 AM Russell Spitzer >> wrote: >> > >> > Thanks for bringing this up,

Re: [VOTE] Clarify "File System Tables" in the table spec

2024-08-01 Thread Russell Spitzer
+1 (Binding) On Thu, Aug 1, 2024 at 7:31 AM Fokko Driesprong wrote: > +1 (binding) > > On Thu, Aug 1, 2024 at 09:57, Eduard Tudenhöfner < > etudenhoef...@apache.org> wrote: > >> +1 (non-binding) >> >> On Thu, Aug 1, 2024 at 6:52 AM Micah Kornfield >> wrote: >> >>> +1 (non-binding) >>> >>> On Wed,

[DISCUSS] Variant Spec Location

2024-08-12 Thread Russell Spitzer
Hi Y’all, We’ve hit a bit of a roadblock with the Variant Proposal. While we were hoping to move the Variant and Shredding specifications from Spark into Iceberg, there doesn’t seem to be a lot of interest in that. Unfortunately, I think we have a number of issues with just linking to the Spark pro

Re: [DISCUSS] Changing namespace separator in REST spec

2024-08-13 Thread Russell Spitzer
I feel like we are overcomplicating this situation. It seems like our specification made a poor choice in terms of separator character; do we have any disagreement on this point? It looks like by choosing a control character, we ended up generating requests which modern server systems define as p

Welcome Péter, Amogh and Eduard to the Apache Iceberg PMC

2024-08-13 Thread Russell Spitzer
this community and thankful for the hard work and stewardship of its members. Thank you for your time, Russell Spitzer

Re: [DISCUSS] Variant Spec Location

2024-08-14 Thread Russell Spitzer
>>>> Peter >>>> >>>> Aihua Xu wrote (on Tue, Aug 13, 2024, >>>> 19:52): >>>> >>>>> Thanks Russell for bringing this up. >>>>> >>>>> This is the main blocker to move forward with the V

Re: [DISCUSS] Variant Spec Location

2024-08-15 Thread Russell Spitzer
erence implementations from apache/variant-type or implement their >> own. >> >> Best, >> Gang >> >> >> >> >> On Thu, Aug 15, 2024 at 10:07 AM Jack Ye wrote: >> >>> +1 for copying the spec into our repository, I think we need to own it >

Re: [DISCUSS] Variant Spec Location

2024-08-15 Thread Russell Spitzer
make sure that the decision is > publicly documented within both communities. > > Thanks, > Micah > > On Thu, Aug 15, 2024 at 7:47 AM Russell Spitzer > wrote: > >> @Gang Wu >> >> I agree that it would be beneficial to make a sub-project, the main >> prob

Re: [DISCUSS] REST Endpoint discovery

2024-08-15 Thread Russell Spitzer
I'm on board for this proposal. I was in the off-mail chats and I think this is probably our simplest approach going forward. On Thu, Aug 15, 2024 at 10:39 AM Dmitri Bourlatchkov wrote: > OpenAPI tool will WARN a lot if Operation IDs overlap. Generated code/html > may also look odd in case of ov

Re: [DISCUSS] Variant Spec Location

2024-08-15 Thread Russell Spitzer
> >> On Thu, Aug 15, 2024, at 23:17, Gang Wu wrote: >> >> +1 on posting this discussion to dev@spark ML >> >> > I don't think there is anything that would stop us from moving to a >> joint project in the future >> >> My concern is that if we don

[DISCUSS] Row Lineage Proposal

2024-08-16 Thread Russell Spitzer
Hi Y'all, We've been working on a new proposal to add Row Lineage to Iceberg in the V3 Spec. The general idea is to give every row a unique identifier as well as a marker of what version of the row it is. This should let us build a variety of features related to CDC, Incremental Processing and Aud

Re: [DISCUSS] Row Lineage Proposal

2024-08-19 Thread Russell Spitzer
he new rows > which updated them during write, then on compaction we could update the > lineage fields based on this info. > > Is there any better ideas with Spark streaming which we can adopt? > > Thanks, > Peter > > [1] - https://paimon.apache.org/docs/0.8/ > > On S

Re: [VOTE] Spec changes in preparation for v3

2024-08-19 Thread Russell Spitzer
+1 - Feels duplicative to vote here and approve on the PR On Mon, Aug 19, 2024 at 2:41 PM Ryan Blue wrote: > Hi everyone, > > I'd like to vote on PR #10948, which has some spec > changes to prepare for v3: > > * Add a high-level v3 summary (only c

Re: [DISCUSS] Release source and binary verification

2024-08-20 Thread Russell Spitzer
I think these are reasonable to add, we probably should also verify there are no binaries of any kind in the release tarball. Sometimes builds accidentally leak these. On Tue, Aug 20, 2024 at 8:36 AM Piotr Findeisen wrote: > Hi All, > > Hi > > The release verification [1] includes testing releas

Re: Community sync

2024-08-20 Thread Russell Spitzer
Copied from Calendar Invite https://www.google.com/url?q=https://meet.google.com/ujy-njjo-vre&sa=D&source=calendar&ust=1724601702621697&usg=AOvVaw1Rd0-RlNwoXE-OIDTVtAGC Triweekly Iceberg meeting for anyone wanting to get involved in the Iceberg development, documentation, or hear about the ro

Re: [VOTE] REST Endpoint discovery

2024-08-20 Thread Russell Spitzer
+1 On Tue, Aug 20, 2024 at 2:32 PM Walaa Eldin Moustafa wrote: > +1 non-binding > > Thanks for driving this Eduard. > > On Tue, Aug 20, 2024 at 12:17 PM Daniel Weeks wrote: > >> +1 >> >> On Tue, Aug 20, 2024 at 11:19 AM Yufei Gu wrote: >> >>> +1 >>> >>> Yufei >>> >>> >>> On Tue, Aug 20, 2024 at

Re: [DISCUSS] SQL syntax extensions

2020-08-25 Thread Russell Spitzer
I think the moment we start touching catalyst we should be using Scala. If in the future there is a stored procedure api in Spark we can always go back to Java. On Tue, Aug 25, 2020, 4:59 PM Anton Okolnychyi wrote: > One more point we should clarify before implementing: where will the SQL > exte

Possible Data Loss with RemoveOrphanFilesAction

2020-09-11 Thread Russell Spitzer
Because the RemoveOrphanFilesAction uses Filesystem.list, the paths of files found in the file system can have an authority included in them based on the core-site.xml. This is determined when listing the files so the entries stored in the metadata tables do not necessarily have to match. URIs will
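The mismatch described above can be illustrated with a small sketch. It is not Iceberg's implementation: `path_only` and `find_orphans` are hypothetical helpers, and the host name and file paths are invented for the example. It only shows why comparing raw location strings can mark a live file as an orphan when one side carries an authority and the other does not.

```python
from urllib.parse import urlparse


def path_only(location: str) -> str:
    """Drop the scheme and authority and keep only the path component."""
    return urlparse(location).path


def find_orphans(listed: list[str], referenced: list[str]) -> list[str]:
    """Return listed files whose path does not appear among the referenced files."""
    referenced_paths = {path_only(loc) for loc in referenced}
    return [loc for loc in listed if path_only(loc) not in referenced_paths]


if __name__ == "__main__":
    # Paths as a file system listing might return them (authority filled in).
    listed = [
        "hdfs://nn1:8020/warehouse/t/data/a.parquet",
        "hdfs://nn1:8020/warehouse/t/data/b.parquet",
    ]
    # Path as it might be stored in table metadata (no authority).
    referenced = ["hdfs:///warehouse/t/data/a.parquet"]

    # Naive string comparison treats both listed files as orphans.
    naive = [loc for loc in listed if loc not in set(referenced)]
    print("naive:", naive)
    print("path-only:", find_orphans(listed, referenced))
```

Comparing on the path alone is itself a simplification: two different clusters could hold different files at the same path, so a real fix would have to decide how scheme and authority differences should be treated.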

Re: Possible Data Loss with RemoveOrphanFilesAction

2020-09-15 Thread Russell Spitzer
rning. Would you like to open a PR for that? > > On Fri, Sep 11, 2020 at 3:45 PM Russell Spitzer > wrote: > >> Because the RemoveOrphanFilesAction uses Filesystem.list, the paths of >> files found in the file system can have an authority included in them based >> on the

Re: How format code in iceberg automatically?

2020-10-10 Thread Russell Spitzer
I load the checkstyle format into IntelliJ to do this, since some of the rules cannot be applied automatically. You can find the style files in .baseline/checkstyle/checkstyle. If you load them into IntelliJ with the checkstyle plugin it will automatically apply them if you use SaveActions and forma

Re: Welcoming Zheng Hu as a new committer

2020-10-10 Thread Russell Spitzer
Congratulations! On Sat, Oct 10, 2020 at 8:24 AM Jungtaek Lim wrote: > Congrats! > > 2020년 10월 10일 (토) 오후 3:56, Junjie Chen 님이 작성: > >> Congratulations! Thanks for your great contribution in Flink sink and >> source! >> >> On Sat, Oct 10, 2020 at 9:09 AM 张军 wrote: >> >>> >>> Congratulations >>>

Re: Welcoming Jingsong Lee as a new committer

2020-10-10 Thread Russell Spitzer
Congratulations! On Sat, Oct 10, 2020 at 8:24 AM Jungtaek Lim wrote: > Congrats! > > 2020년 10월 10일 (토) 오후 3:56, Junjie Chen 님이 작성: > >> Congratulations! Thanks for your great contribution in Flink sink and >> source! >> >> On Sat, Oct 10, 2020 at 9:10 AM 张军 wrote: >> >>> >>> Congratulations >

Re: [VOTE] Release Apache Iceberg 0.10.0 RC2

2020-10-30 Thread Russell Spitzer
+1 (non-binding) Downloaded and ran build with Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.7+8-LTS, mixed mode). All tests passed :) On Fri, Oct 30, 2020 at 4:05 PM Anton Okolnychyi wrote: > Here is the link to steps we normally use to validate a release candidate: > > https://lists.apach

Re: User defined properties

2020-11-17 Thread Russell Spitzer
Invite Link to ASF Slack https://join.slack.com/t/the-asf/shared_invite/zt-ei391t4y-47FbOeNfu8SVA8BmSrtm8A I believe the warning there is just to know it may have collisions in the future or may be locked down, but I think it's pretty safe to actually add your own properties there imho On Mon, N

Re: Could I participate into this email list

2020-12-01 Thread Russell Spitzer
Of course! There is a virtual meetup tomorrow as well; I will send you an invite. On Tue, Dec 1, 2020, 10:23 AM 赵德儒 wrote: > Hi Guys, Could I participate into this email list, I would like to join > this discussion in the future or others activities, thank you so much. > > > -- > Best, > > *Axl Zh

Re: Shall we start a regular community sync up?

2020-12-01 Thread Russell Spitzer
Invite sent On Tue, Dec 1, 2020 at 1:20 PM Wing Yew Poon wrote: > I'd like to attend the community syncs as well. Can you please send me an > invite? > Thanks, > Wing Yew Poon > > On Thu, Nov 19, 2020 at 9:25 PM Chitresh Kakwani < > chitreshkakw...@gmail.com> wrote: > >> Hi Ryan, >> >> Could you

Re: Shall we start a regular community sync up?

2020-12-01 Thread Russell Spitzer
Sent On Tue, Dec 1, 2020 at 1:43 PM Chen Song wrote: > Hey Ryan > > Could you please send me an invite list as well. We have been evaluating > Iceberg strategically in our company and very interested in this. > > Best, > Chen > > On Thu, Nov 19, 2020 at 7:51 AM Vivekanand Vellanki > wrote: > >>
