Re: [DISCUSS] v4 - One file commits

2025-06-02 Thread Steve Loughran
So this'll cut down on the # of manifest files read, won't it? So improving query planning. Does anyone have an estimate of what benefit this is likely to have in production deployments? On Thu, 29 May 2025 at 21:25, Ryan Blue wrote: > Hi everyone, > > Like Russell’s recent note, I’m starting a threa

Re: Spark 4.0/Iceberg Integration Merged – Spark 3.5 Merges Can Resume

2025-05-16 Thread Steve
Thanks Huaxin! Great work and looking forward to the spark 4.0 and iceberg! On Thu, May 15, 2025 at 15:26 Ryan Blue wrote: > I agree, thank you for working on this! It's great to have this merged. > > On Thu, May 15, 2025 at 7:47 AM Russell Spitzer > wrote: > >> Thanks for getting this in! >> >

Re: [VOTE] Clarify writer requirements in the spec to prevent orphan DVs

2025-05-14 Thread Steve Zhang
+1 (non-binding) Thanks, Steve Zhang > On May 14, 2025, at 8:52 AM, Anton Okolnychyi wrote: > > Hi all, > > I propose the following update to the spec to clarify that writers must > remove any deletion vector that applies to a data file when that data file is > remo

Re: [VOTE] Merge details about GZip metadata files to the spec.

2025-05-12 Thread Steve Zhang
+1 (non-binding) Thanks, Steve Zhang > On May 11, 2025, at 6:45 PM, Gang Wu wrote: > > +1 (non-binding)

Re: [VOTE] Small spec change for default values

2025-04-22 Thread Steve Zhang
+1 (non binding) Thanks, Steve Zhang > On Apr 22, 2025, at 1:41 PM, Prashant Singh wrote: > > +1 (non-binding) > > Best, > Prashant Singh > > On Tue, Apr 22, 2025 at 2:55 AM Eduard Tudenhöfner <mailto:etudenhoef...@apache.org>> wrote: >> +1 >

Re: [VOTE] Spec Update: Variant Field Lower/Upper Bounds

2025-04-18 Thread Steve Zhang
+1 (non-binding) Thanks, Steve Zhang > On Apr 18, 2025, at 1:29 PM, huaxin gao wrote: > > +1 (non-binding)

Re: [VOTE] Simplify multi-argument field-id(s) encoding

2025-04-18 Thread Steve Zhang
+1 (non-binding) Thanks, Steve Zhang > On Apr 18, 2025, at 12:53 AM, Prashant Singh wrote: > > +1 (non-binding)

Re: [VOTE] Row lineage required for v3

2025-04-01 Thread Steve Zhang
+1 (non binding) Thanks, Steve Zhang > On Mar 31, 2025, at 11:05 PM, Jean-Baptiste Onofré wrote: > > +1 (non binding)

Re: Table metadata swap not work for REST Catalog (#12134)

2025-03-28 Thread Steve Zhang
metadata.json have a different table UUID as the existing one. We appreciate any insights or suggestions you may have. Best, Steve Zhang > On Feb 10, 2025, at 4:47 PM, Steve Zhang > wrote: > > Thank you Russell and Ryan. > > Let me start to work on a new API to support force tab

Re: [VOTE] Minor simplifications for Geo Spec

2025-03-21 Thread Steve Zhang
+1 (non-binding) Thanks, Steve Zhang > On Mar 18, 2025, at 6:29 PM, Gang Wu wrote: > > +1 (non-binding)

Re: [Discuss] Apache Iceberg 1.9.0 release

2025-03-17 Thread Steve Loughran
Can I get this reviewed and merged; gives all hadoop filesystems with bulk delete calls the ability to issue bulk deletes up to their page sizes; off by default. Tested all the way through iceberg to AWS S3 london. https://github.com/apache/iceberg/pull/10233 On Mon, 17 Mar 2025 at 12:32, Yuya

Re: [VOTE] Release Apache Iceberg 1.8.1 RC1

2025-02-27 Thread Steve Zhang
+1 (non-binding) - Checked signature/SHA512 - Ran RAT license check - Ran tests on JDK17 Thanks, Steve Zhang > On Feb 27, 2025, at 11:45 AM, Daniel Weeks wrote: > > +1 (binding) > > Verified sigs/sums/license/build/test (Java 17) > > -Dan > > On Thu, Feb 27

Re: [VOTE] Java implementation notes around current-snapshot-id

2025-02-24 Thread Steve Zhang
+1 (nb) Thanks, Steve Zhang > On Feb 24, 2025, at 6:32 PM, Renjie Liu wrote: > > +1 > > On Tue, Feb 25, 2025 at 7:00 AM Szehon Ho <mailto:szehon.apa...@gmail.com>> wrote: >> +1 >> >> Thanks >> Szehon >> >> On Mon, Feb 2

Re: Remove deprecated table properties

2025-02-17 Thread Steve Zhang
Thanks Fokko for removing deprecated properties! Just want to highlight the worst case: users of tables with old configuration who are not aware of this deprecation might experience a silent behavior change. But considering this has been deprecated for the past 3 years, here’s my +1. Thanks, Steve Zhang

Re: [Discuss] Print un-pretty metadata JSON files without whitespace

2025-02-17 Thread Steve Zhang
+1. Setting the table property `write.metadata.compression-codec` to gzip is usually suggested to reduce metadata size, but dropping whitespace can still help here. Thanks, Steve Zhang > On Feb 17, 2025, at 8:32 AM, Fokko Driesprong wrote: > > Hey Ian, > > Thanks for raising th
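
A rough way to see the two effects side by side is to serialize the same JSON document with and without whitespace and then gzip both. This is only a sketch: the `metadata` dict below is a made-up, heavily trimmed stand-in rather than a real table metadata file, and actual savings will depend on the table.

```python
import gzip
import json

# Hypothetical stand-in for a table metadata document (illustrative only).
metadata = {
    "format-version": 2,
    "table-uuid": "00000000-0000-0000-0000-000000000000",
    "schemas": [
        {
            "schema-id": schema_id,
            "fields": [
                {"id": field_id, "name": f"col_{field_id}", "type": "long", "required": False}
                for field_id in range(50)
            ],
        }
        for schema_id in range(5)
    ],
}

# Pretty-printed JSON versus JSON with no insignificant whitespace.
pretty = json.dumps(metadata, indent=2).encode("utf-8")
compact = json.dumps(metadata, separators=(",", ":")).encode("utf-8")

sizes = {
    "pretty": len(pretty),
    "compact": len(compact),
    "pretty+gzip": len(gzip.compress(pretty)),
    "compact+gzip": len(gzip.compress(compact)),
}
for label, size in sizes.items():
    print(f"{label:>13}: {size} bytes")
```

Both serializations parse back to the same document, so dropping whitespace changes only the byte count, whether or not a compression codec is also configured.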

Re: [VOTE] Add overwriteRequested to RegisterTableRequest in REST spec

2025-02-16 Thread Steve Zhang
Thanks everyone for participating and reviewing! The vote has passed with the following results: +1 binding votes: 8 (Russell, Steven, Fokko, Daniel, Ryan, Yufei, Szehon, Eduard) +1 non-binding votes: 4 (Huang-Hsiang, Anurag, Huaxin, myself) 0 votes: none -1 votes: none Thanks, Steve Zhang > On Feb

[VOTE] Add overwriteRequested to RegisterTableRequest in REST spec

2025-02-12 Thread Steve Zhang
ps://github.com/apache/iceberg/pull/12239 This vote will be open for at least 72 hours. [ ] +1 Add overwriteRequested to RegisterTableRequest in REST spec [ ] +0 [ ] -1 I have questions and/or concerns Thanks, Steve Zhang

Re: [VOTE] Release Apache Iceberg 1.8.0 RC0

2025-02-12 Thread Steve Zhang
+1 (non-binding) - Checked signature/SHA512 - Ran RAT check - Ran tests on JDK17 Thanks, Steve Zhang > On Feb 12, 2025, at 9:52 AM, Eduard Tudenhöfner > wrote: > > +1 (binding) > > Verified sigs/checksums/build/tests with JDK17 > > I also saw the same TestS3Fi

Re: [VOTE] Add RemoveSchemas update type to REST spec

2025-02-11 Thread Steve Zhang
+1 nb Thanks, Steve Zhang > On Feb 11, 2025, at 10:26 AM, Honah J. wrote: > > +1 > > On Tue, Feb 11, 2025 at 10:16 AM Christian Thiel <mailto:christian.t.b...@gmail.com>> wrote: >> +1 (non-binding) >> Thanks Gabor! >> >> On Tue,

Re: Table metadata swap not work for REST Catalog (#12134)

2025-02-10 Thread Steve Zhang
Thank you Russell and Ryan. Let me start to work on a new API to support force table registration in catalog. Thanks, Steve Zhang > On Feb 10, 2025, at 4:29 PM, rdb...@gmail.com wrote: > > Yeah, it sounds like a "register table force" is the right concept here. I >

Re: Table metadata swap not work for REST Catalog (#12134)

2025-02-10 Thread Steve Zhang
metadata swap on an existing table. However, it was suggested that using TableOperations.commit(base, new) is preferred to achieve atomicity. Thanks, Steve Zhang > On Feb 10, 2025, at 1:49 PM, Daniel Weeks wrote: > > Hey Steve, > > I think the issue here is that you're using the

Re: [VOTE] Simplify multi-arg table metadata

2025-02-10 Thread Steve Zhang
+1 (non-binding). Thanks, Steve Zhang > On Feb 9, 2025, at 1:01 AM, Fokko Driesprong wrote: > > (Second attempt, the cat <https://ibb.co/Wv4M2TDY> ran over the keyboard) > > Hey everyone, > > After the positive responses on the devlist > &l

Re: Welcome Huaxin Gao as a committer!

2025-02-07 Thread Steve Herbert
Congratulations Huaxin! Awesome! >>> >>> >>> On Thu, Feb 6, 2025 at 9:27 AM Yufei Gu wrote: >>> >>>> Congrats Huaxin! >>>> >>>> Yufei >>>> >>>> >>>> On Thu, Feb 6, 2025 at 9:09 AM Steve Zhang

Table metadata swap not work for REST Catalog (#12134)

2025-02-07 Thread Steve Zhang
/apache/iceberg/issues/12134, where it works on JDBC and in-memory catalogs, but not with RESTCatalog. Best Regards, Steve Zhang

Re: [VOTE] Add Geometry and Geography types for V3

2025-02-06 Thread Steve Zhang
+1 (non-binding), Thank you Szehon for driving this over 5+ months and looking forward to seeing it in the V3 spec. Thanks, Steve Zhang > On Feb 6, 2025, at 12:11 PM, Jia Yu wrote: > > +1 (non-binding) > > I can’t wait to see the impact it will have on the broader geospatial > c

Re: Welcome Huaxin Gao as a committer!

2025-02-06 Thread Steve Zhang
Congratulations Huaxin, well deserved! Thanks, Steve Zhang > On Feb 6, 2025, at 8:16 AM, Xingyuan Lin wrote: > > Congrats Huaxin! > > On Thu, Feb 6, 2025 at 11:11 AM Denny Lee <mailto:denny.g@gmail.com>> wrote: >> Congratulations Huaxin!!! >> >

Re: Very strange (AI generated) issues

2025-01-31 Thread Steve Loughran
What about extending the issue templates? Because of a growing problem with worthless LLM-generated issues, github MAY terminate any account doing this to our project [ ] I am a human being and am not creating AI generated issues. [ ] I accept that if I am posting AI-generated issues, my github ac

Re: Proposal: Parquet footer size in Iceberg metadata

2025-01-30 Thread Steve Loughran
Knowing the footer offset would be really useful if passed down to whatever is implementing the input stream, along with the actual file size. This can be used for prefetching the footer, as well as caching it (Azure ABFS, google GCS connectors): right now they guess that about 1MB is all they nee
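
To make the point concrete: a Parquet file ends with a 4-byte little-endian footer length followed by the magic bytes `PAR1`, so a reader that knows neither the file size nor the footer size has to fetch the tail before it can fetch the footer. The sketch below shows only that arithmetic, under stated assumptions: the function name and the synthetic file are hypothetical, and encrypted-footer files (which use a different magic) are not handled.

```python
import struct

PARQUET_MAGIC = b"PAR1"
TRAILER_LEN = 8  # 4-byte little-endian footer length + 4-byte magic


def footer_range(file_size: int, trailer: bytes) -> tuple[int, int]:
    """Return (offset, length) of the footer, given the file's last 8 bytes."""
    if len(trailer) != TRAILER_LEN or trailer[4:] != PARQUET_MAGIC:
        raise ValueError("not a plain Parquet trailer")
    (footer_len,) = struct.unpack("<I", trailer[:4])
    offset = file_size - TRAILER_LEN - footer_len
    if offset < len(PARQUET_MAGIC):
        raise ValueError("footer length is larger than the file")
    return offset, footer_len


# Synthetic "file": leading magic, fake column data, fake footer, then the trailer.
body = PARQUET_MAGIC + b"\x00" * 100
footer = b"F" * 37
blob = body + footer + struct.pack("<I", len(footer)) + PARQUET_MAGIC

offset, length = footer_range(len(blob), blob[-TRAILER_LEN:])
print(offset, length)
```

If the footer length were already recorded alongside the file size, `offset` could be computed without reading the trailer at all, which is what would let an input stream issue one exact ranged read or prefetch instead of guessing a tail size.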

Re: missing files in an Iceberg table

2025-01-30 Thread Steve Loughran
apache.org/jira/browse/HADOOP-14837 Otherwise, something to at least scan a table and initiate slow recovery could be useful. steve On Tue, 28 Jan 2025 at 15:16, Zach Dischner wrote: > Hi Wing, > > Thank you for bringing this up. We run into this all the time, > particularly when the

Re: Very strange (AI generated) issues

2025-01-29 Thread Steve Loughran
Are these issues being manually created? maybe add a new checkbox [ ] I am not participating in any AI training/experiment and if it turns out that I am -I agree to compensate developers for the time wasted. Or have something to specifically handle new posters., or at least automatically flag th

Re: [DISCUSS/VOTE] Add in ChangeLog Reserved Field IDs to Spec and Decrement Row Lineage Reserved IDs

2025-01-26 Thread Steve Zhang
+1 (non-binding) Thanks, Steve Zhang > On Jan 25, 2025, at 10:48 AM, huaxin gao wrote: > > +1 (non-binding)

Re: [Discuss][Vote] Spec Change - Add optional field added-rows to Snapshot for Row Lineage

2025-01-16 Thread Steve Zhang
Thank you Russell! +1 (non-binding) Thanks, Steve Zhang > On Jan 15, 2025, at 10:53 PM, huaxin gao wrote: > > +1 (non-binding)

Re: [VOTE] Document Snapshot Summary Optional Fields as Appendix in Spec

2025-01-14 Thread Steve Zhang
+1 non-binding Thanks, Steve Zhang > On Jan 14, 2025, at 1:14 PM, Kevin Liu wrote: > > +1 non-binding.

Re: There is no easy way to secure Iceberg data. How can we improve?

2025-01-03 Thread Steve Loughran
actually, there is a way for the catalog to return S3 objects without granting access to the entire bucket: aws presigning: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-presigned-url.html This offers time-bounded access to an object catalog will need to generate and return the pres

Re: There is no easy way to secure Iceberg data. How can we improve?

2025-01-02 Thread Steve Loughran
if the data is stored in S3 then if someone has unrestricted access to a single store containing all the data (default without S3 access grants, cloudera ranger extensions or some other access control mechanism to grant access to clients without sharing credentials) - then it's effectively impossib

Re: New committer: Scott Donnelly

2024-12-11 Thread Steve Zhang
Congratulations Scott! Thanks, Steve Zhang > On Dec 11, 2024, at 4:47 AM, Fokko Driesprong wrote: > > Congratulations Scott! >

Re: New committer: Matt Topol

2024-12-10 Thread Steve Zhang
Congrats Matt! Thanks, Steve Zhang > On Dec 10, 2024, at 7:24 AM, Gang Wu wrote: > > Congrats Matt!

Re: Storing catalog directly on object store

2024-12-06 Thread Steve Loughran
e issues? One that is probably quite expensive to test. Whoever implements this is going to be left trying to work around problems. This is best delegated to the AWS S3Table team as they may actually get some support. -Steve * be really good if other people commented on there to make it clear

Re: [Discuss] Simplify tableExists API in HiveCatalog

2024-12-02 Thread Steve Zhang
plans to change other existing behaviors. I've addressed the feedback from reviewers and also added explicit test coverage; the PR is ready for another look at https://github.com/apache/iceberg/pull/11597. Thanks, Steve Zhang > On Nov 27, 2024, at 10:15 PM, Péter Váry wrote: > > +

Re: Storing catalog directly on object store

2024-11-27 Thread Steve Loughran
There's a PR up from amazon to add this to the s3a connector https://github.com/apache/hadoop/pull/7011 targeting a 3.4.2 release early next year, though they've not updated the PR as requested yet. 1. It doesn't give you the same semantics as posix create-no-overwrite call -you only get t

[Discuss] Simplify tableExists API in HiveCatalog

2024-11-21 Thread Steve Zhang
://github.com/apache/iceberg/pull/11597 [2]: https://github.com/apache/iceberg/blob/3badfe0c1fcf0c0adfc7aa4a10f0b50365c48cf9/open-api/rest-catalog-open-api.yaml#L1129-L1133 Best regards, Steve Zhang

Re: [DISCUSS] Deprecate embedded manifests

2024-11-21 Thread Steve Zhang
+1 to deprecate Thanks, Steve Zhang > On Nov 19, 2024, at 3:32 AM, Fokko Driesprong wrote: > > Hi everyone, > > I would like to propose to deprecate embedded manifests > <https://github.com/apache/iceberg/pull/11586>. This has been used before the > manifest

Re: [VOTE] Deprecate and remove last-column-id

2024-11-19 Thread Steve Zhang
+1 nb Thanks, Steve Zhang > On Nov 19, 2024, at 12:18 AM, Fokko Driesprong wrote: > > Hi everyone, > > Based on the positive feedback on the [DISCUSS] thread > <https://lists.apache.org/thread/jz5s7pm2bhbm87ft495d6yrsh3bqvtb9> and the > pull-request on GitHub

Re: [DISCUSS] Spark 3.3 support?

2024-11-13 Thread Steve Zhang
+1 to deprecating and removing it Thanks, Steve Zhang > On Nov 13, 2024, at 11:23 AM, Eduard Tudenhöfner > wrote: > > +1 to deprecating and removing it

Re: [ANNOUNCE] Apache Iceberg release 1.7.0

2024-11-11 Thread Steve Herbert
Great news on the 1.7 release! Thanks to everyone who contributed and thanks Russ for driving the release itself over the finish line! On Mon, Nov 11, 2024 at 8:22 AM Bryan Keller wrote: > A user discovered an issue with the Kafka Connect distribution as a result > of an Azure dependency update

Re: [VOTE] Deletion Vectors in V3

2024-10-31 Thread Steve Zhang
+1 (non-binding) Thanks, Steve Zhang > On Oct 31, 2024, at 3:41 PM, rdb...@gmail.com wrote: > > +1 > > Thanks, Anton! > > On Wed, Oct 30, 2024 at 11:58 PM Fokko Driesprong <mailto:fo...@apache.org>> wrote: >> +1 >> >> I had to read up a

Re: [DISCUSS] Remove iceberg-pig module ?

2024-10-17 Thread Steve Zhang
+1 Thanks, Steve Zhang > On Oct 17, 2024, at 11:16 PM, roryqi wrote: > > +1. > > Péter Váry mailto:peter.vary.apa...@gmail.com>> > 于2024年10月18日周五 13:44写道: >> +1 >> >> On Fri, Oct 18, 2024, 04:50 Manu Zhang > <mailto:owenzhang1...@gmail.com&

[PR] Re: Iceberg-arrow vectorized read bug

2024-10-02 Thread Lessard, Steve
g NULL for the missing column’s value. When you have a moment, please review the PR. Thanks for your help, Steve Lessard Teradata From: Lessard, Steve Date: Monday, August 12, 2024 at 11:59 AM To: dev@iceberg.apache.org , Amogh Jahagirdar <2am...@gmail.com> Subject: Re: Iceberg-arrow vec

Re: [DISCUSS] Drop Hive 2 support

2024-09-09 Thread Steve Zhang
+1 Thanks, Steve Zhang > On Sep 9, 2024, at 11:45 AM, Russell Spitzer > wrote: > > +1 > > On Mon, Sep 9, 2024 at 7:59 AM Eduard Tudenhöfner <mailto:etudenhoef...@apache.org>> wrote: >> +1 on deprecating Hive 2 in Iceberg 1.7 and removing it in 1.8 >

Re: [VOTE] Merge REST Spec Change To Add New Scan Planning APIs

2024-09-06 Thread Steve Zhang
, Steve Zhang > On Sep 4, 2024, at 9:53 AM, Chertara, Rahil > wrote: > > An endpoint fetchScanTasks was added in order for a client to get the > file-scan-tasks associated with a plan-task by providing a plan-task as input. >

Re: [DISCUSS] Variant Spec Location

2024-08-28 Thread Steve Loughran
t said, I do think it's the right place for it. Unless there is a large body of log4j committers who have different opinions. steve On Wed, 28 Aug 2024 at 20:07, Aihua Xu wrote: > As the discussions in the Spark community ( > https://lists.apache.org/thread/0k5oj3mn0049fcxoxm3g

Re: [Discuss] test logging is broken and Avro 1.12.0 upgraded slf4j-api dep to 2.x

2024-08-26 Thread Steve Zhang
I believe dependabot tried to upgrade slf4j to 2.x in [1] but JB mentioned there's -1 on this upgrade, maybe he has more context. [1] https://github.com/apache/iceberg/pull/9688 Thanks, Steve Zhang > On Aug 24, 2024, at 7:37 PM, Steven Wu wrote: > > Hi, > > It seems

Re: clarification on changelog behavior for equality deletes

2024-08-22 Thread Steve Zhang
cannot be used together. Thanks, Steve Zhang > On Aug 22, 2024, at 8:50 AM, Steven Wu wrote: > > > It should emit changes for each snapshot in the requested range. > > Wing Yew has a good point here. +1 > > On Thu, Aug 22, 2024 at 8:46 AM Wing Yew Poon > wrote:

Re: clarification on changelog behavior for equality deletes

2024-08-21 Thread Steve Zhang
I agree that option (a) is what user expects for row level changes. I feel the added deletes in given snapshots provides a PK of DELETED entry, existing deletes are used to read together with data files to find DELETED value (V1b) and result of columns. Thanks, Steve Zhang > On Aug

Community sync

2024-08-20 Thread Lessard, Steve
Based on previous emails in this list I got the impression that a community sync is coming up, possibly tomorrow morning. Where can I find the meeting information so that I may listen in? -Steve Lessard, Teradata

Re: [VOTE] Spec changes in preparation for v3

2024-08-19 Thread Steve Zhang
+1 (non-binding) Thanks, Steve Zhang > On Aug 19, 2024, at 1:47 PM, John Zhuge wrote: > > +1 (non-binding) > > On Mon, Aug 19, 2024 at 1:34 PM Yufei Gu <mailto:flyrain...@gmail.com>> wrote: >> +1 >> Yufei >> >> >> On Mon,

Re: [EXTERNAL] Re: Iceberg-arrow vectorized read bug

2024-08-16 Thread Lessard, Steve
closer to a workable solution. I built upon the two pull requests I shared in my previous email and came up with the solution in https://github.com/apache/iceberg/pull/10953. I annotated that pull request to point out areas I know are problematic. -Steve Lessard, Teradata. From: Eduard

Re: Welcome Péter, Amogh and Eduard to the Apache Iceberg PMC

2024-08-14 Thread Steve Loughran
congratulations all. On Tue, 13 Aug 2024 at 21:25, Russell Spitzer wrote: > Hi Y'all, > > It is my pleasure to let everyone know that the Iceberg PMC has voted to > have several talented individuals join us. > > So without further ado, please welcome Péter Váry, Amogh Jahagirdar and > Eduard Tud

Re: Welcome Péter, Amogh and Eduard to the Apache Iceberg PMC

2024-08-13 Thread Steve Zhang
Congrats everyone, well deserved! Thanks, Steve Zhang > On Aug 13, 2024, at 1:31 PM, Bill Zhang wrote: > > Congratulations everyone. > > On Tue, Aug 13, 2024 at 1:28 PM Szehon Ho <mailto:szehon.apa...@gmail.com>> wrote: >> Congratulations all, very well des

Re: [DISCUSS] Filesystem in PyIceberg

2024-08-13 Thread Steve Loughran
On Tue, 13 Aug 2024 at 03:50, Xuanwo wrote: > Hi, André > > Thanks a lot for starting this thread. > > List operations on storage services are expensive and slow. That's why > Iceberg is designed to store metadata in files and avoid using list > operations in FileIO. However, `orphan file removal

Re: Iceberg-arrow vectorized read bug

2024-08-12 Thread Lessard, Steve
ist to dream up a better solution and mentor me so that I may code up that solution and further explore it. From: Lessard, Steve Date: Thursday, August 1, 2024 at 4:38 PM To: dev@iceberg.apache.org , Amogh Jahagirdar <2am...@gmail.com> Cc: dev@iceberg.apache.org Subject: Re: [EXT

Re: [EXTERNAL] Re: Iceberg-arrow vectorized read bug

2024-08-01 Thread Lessard, Steve
Hi Amogh, Do you think you could have another look at this issue or point me to someone who might be able to help me identify the root cause and the correct fix? From: Lessard, Steve Date: Monday, July 29, 2024 at 5:46 PM To: dev@iceberg.apache.org , Amogh Jahagirdar <2am...@gmail.

Re: [VOTE] Clarify "File System Tables" in the table spec

2024-08-01 Thread Steve Zhang
+1 (non-binding) Thanks, Steve Zhang > On Aug 1, 2024, at 2:25 PM, John Zhuge wrote: > > +1 (non-binding) > > On Thu, Aug 1, 2024 at 10:55 AM Amogh Jahagirdar <2am...@gmail.com > <mailto:2am...@gmail.com>> wrote: >> +1 (non-binding) >> &g

Re: [EXTERNAL] Re: Case-insensitive schemas

2024-07-31 Thread Lessard, Steve
Ryan, I also replied in the PR. In my testing I do not see any runtime failure when trying to use a case-sensitive schema in a case-insensitive way. -Steve Lessard, Teradata From: Ryan Blue Date: Wednesday, July 31, 2024 at 5:02 PM To: dev@iceberg.apache.org Subject: [EXTERNAL] Re: Case

Case-insensitive schemas

2024-07-31 Thread Lessard, Steve
d nothing. Is there some kind of configuration or metadata flag that gives a hint? -Steve Lessard, Teradata

Re: [DISCUSS] Deprecate HadoopTableOperations, move to tests in 2.0

2024-07-30 Thread Steve Loughran
On Thu, 18 Jul 2024 at 00:02, Ryan Blue wrote: > Hey everyone, > > There has been some recent discussion about improving > HadoopTableOperations and the catalog based on those tables, but we've > discouraged using file system only table (or "hadoop" tables) for years now > because of major proble

Re: [EXTERNAL] Re: Iceberg-arrow vectorized read bug

2024-07-29 Thread Lessard, Steve
Adding Amogh Jahagirdar to the To: line… From: Lessard, Steve Date: Monday, July 29, 2024 at 1:12 PM To: dev@iceberg.apache.org , steve.less...@teradata.com.invalid Cc: dev@iceberg.apache.org Subject: Re: [EXTERNAL] Re: Iceberg-arrow vectorized read bug Hi Amogh, Did you get a chance to look

case-insensitivity for column names in PartitionSpec (issue 10668)

2024-07-29 Thread Lessard, Steve
you pointed out that my initial solution would create problems. I have since reworked my solution in a manner that should address issue 10668 as well as address the shortcoming you identified. Could you please have another look at my PR? -Steve Lessard, Teradata

Re: [EXTERNAL] Re: Iceberg-arrow vectorized read bug

2024-07-29 Thread Lessard, Steve
some solution I am missing? -Steve Lessard, Teradata From: Amogh Jahagirdar <2am...@gmail.com> Date: Wednesday, June 26, 2024 at 10:59 PM To: steve.less...@teradata.com.invalid Cc: dev@iceberg.apache.org Subject: [EXTERNAL] Re: Iceberg-arrow vectorized read bug You don't often ge

Re: [VOTE] Drop Java 8 support in Iceberg 1.7.0

2024-07-26 Thread Steve Zhang
+1 (non-binding) Thanks, Steve Zhang > On Jul 26, 2024, at 9:15 AM, Amogh Jahagirdar <2am...@gmail.com> wrote: > > +1 (non-binding)

Re: Java String to Expression Util?

2024-07-25 Thread Steve Zhang
/caf03aed926665014c22cc4a68902bf684f258f9/core/src/main/java/org/apache/iceberg/expressions/ExpressionParser.java#L262-L264 Thanks, Steve Zhang > On Jul 25, 2024, at 11:54 AM, Pucheng Yang wrote: > > Hi dev community, > > If I read the codebase correctly, there seems to be no utility for converting

Re: [ANNOUNCE] Welcoming new committers and PMC members

2024-07-23 Thread Steve Zhang
Congrats everyone! Thanks, Steve Zhang > On Jul 23, 2024, at 9:20 AM, Anton Okolnychyi wrote: > > Congrats everyone!

Re: Dropping JDK 8 support

2024-07-23 Thread Steve Zhang
+1 (non-binding) Thanks, Steve Zhang > On Jul 22, 2024, at 10:13 PM, Ajantha Bhat wrote: > > +1 (non-binding)

Re: [VOTE] Release Apache Iceberg 1.6.0 RC1

2024-07-22 Thread Steve Zhang
+1 non-binding Checked signature, SHA512 and license, built and ran tests against java 17 Thanks, Steve Zhang > On Jul 22, 2024, at 3:29 PM, Jack Ye wrote: > > +1 (binding) > > Checked signature, checksum, license > Ran unit and integration tests with JDK17 > Ran ma

Re: [DISCUSS] DROP PARTITION in Spark

2024-07-17 Thread Steve Zhang
-iceberg-do-differently Thanks, Steve Zhang > On Jul 17, 2024, at 2:36 PM, Walaa Eldin Moustafa > wrote: > > Hi Jean, One use case is Hive to Iceberg migration, where DROP PARTITION does > not need to change to DELETE queries prior to the migration. > > That said, I am not in

Re: [DISCUSS] Enable the discussion tab for iceberg github repos

2024-07-11 Thread Steve Zhang
+1, looking forward to see it in action, would 3 months be a good evaluation window for rest of the iceberg repo? Thanks, Steve Zhang > On Jul 11, 2024, at 9:52 AM, Yufei Gu wrote: > > +1. It is a no-brainer to me given it is more search-engine friendly compared > to slack a

Re: Building with JDK 21

2024-07-11 Thread Steve Loughran
thing. Reflection is going to be needed to use the successor classes/methods, but because the UGI calls are common in hive and any other service which supports different users in the same process, getting adoption of the new operations will be tricky. steve Java21 would be interesting; their new

Re: [Vote] Deprecate oauth tokens endpoint

2024-07-10 Thread Steve Zhang
+1 (non binding) Thanks, Steve Zhang > On Jul 10, 2024, at 7:31 AM, Renjie Liu wrote: > > +1 (non binding)

Re: [VOTE] Fix property names in REST spec for statistics / partition statistics

2024-07-10 Thread Steve Zhang
+1 (non binding) Thanks, Steve Zhang > On Jul 10, 2024, at 1:10 AM, Jean-Baptiste Onofré wrote: > > +1 (non binding)

Iceberg-arrow vectorized read bug

2024-06-27 Thread Lessard, Steve
e root cause of the issue I originally reported as issue 10275<https://github.com/apache/iceberg/issues/10275>? -Steve Lessard, Teradata

Iceberg-arrow vectorized read bug

2024-06-26 Thread Lessard, Steve
e root cause of the issue I originally reported as issue 10275<https://github.com/apache/iceberg/issues/10275>? -Steve Lessard, Teradata

Re: Making the NDV property required for theta sketch blobs in Puffin

2024-06-22 Thread Steve Zhang
+1 for making the NDV property required in blob metadata Thanks, Steve Zhang > On Jun 21, 2024, at 2:54 PM, Amogh Jahagirdar <2am...@gmail.com> wrote: > > make the property required

Re: [Early Feedback] Variant and Subcolumnarization Support

2024-05-14 Thread Steve Zhang
Great ideas! Looking forward to the community focused proposal in details and how it can benefit iceberg! Thanks, Steve Zhang > On May 10, 2024, at 10:06 PM, Tyler Akidau > wrote: > > Subcolumnarization of variant columns allows query engines to efficiently > prune datasets

Re: New committer: Renjie Liu

2024-03-11 Thread Steve Zhang
Congrats Renjie! Thanks, Steve Zhang > On Mar 11, 2024, at 12:18 PM, Szehon Ho wrote: > > Congratulations! > > On Mon, Mar 11, 2024 at 12:43 PM Jack Ye <mailto:yezhao...@gmail.com>> wrote: >> Congratulations Renjie! >> >> Best, >> Jack Y

Re: New committer: Bryan Keller

2024-03-05 Thread Steve Zhang
Congrats Bryan, well deserved! Thanks, Steve Zhang > On Mar 5, 2024, at 9:44 AM, Szehon Ho wrote: > > Congratulations Bryan, well deserved, great work on Iceberg ! > > On Tue, Mar 5, 2024 at 8:14 AM Jack Ye <mailto:yezhao...@gmail.com>> wrote: >> Congrats Bry

Re: [VOTE] Release Apache PyIceberg 0.6.0rc1

2024-01-31 Thread Steve Zhang
of 0.6.0! Thanks, Steve Zhang > On Jan 31, 2024, at 8:33 AM, Pucheng Yang wrote: > > nvm, I was under the wrong impression it was released already. Thanks. > > On Wed, Jan 31, 2024 at 8:31 AM Pucheng Yang <mailto:py...@pinterest.com>> wrote: >> 0.6.0 has been re

Re: [DISCUSS] PyIceberg 0.6.0 release

2024-01-26 Thread Steve Zhang
I am really excited to see the both append and overwrite support are finally checked and in favor of seeing it in 0.6.0 release. It’s a big milestone worth celebrating! I am happy to help with partitioned write and sort order. Thanks, Steve Zhang > On Jan 26, 2024, at 5:22 AM, Fo

Re: [DISCUSS] Iceberg community summit

2024-01-16 Thread Steve Zhang
I am also happy to contribute here! Thanks, Steve Zhang > On Jan 16, 2024, at 4:17 PM, Bill Zhang wrote: > > Same here. We'd like to volunteer and help out with this summit. > > On Mon, Jan 15, 2024 at 4:46 AM Eduard Tudenhoefner <mailto:edu...@tabular.io>> w

JDBC namespace existence check #8340

2023-10-19 Thread Steve Zhang
in https://github.com/apache/iceberg/issues/8321 and https://github.com/apache/iceberg/issues/8832. Thanks, Steve Zhang

Re: [VOTE] Release Apache Iceberg 1.4.1 RC0

2023-10-19 Thread Steve Zhang
+1 (non-binding) - validated checksum and signature - checked license docs & ran RAT checks - ran build and tests using JDK17 (problem with TestS3RestSigner and ADLSFileIOTest related but I think it’s setup related) Thanks, Steve Zhang > On Oct 19, 2023, at 4:23 AM, Ajantha Bhat

Re: Welcome new committers and PMC!

2023-05-03 Thread Steve Zhang
Congrats everyone! Well deserved and great job! Thanks, Steve Zhang > On May 3, 2023, at 5:52 PM, Prashant Singh wrote: > > Congratulations, Amogh, Eduard, Szehon Well deserved ! > > On Wed, May 3, 2023 at 3:07 PM Steven Wu <mailto:stevenz...@gmail.com>>

Re: Support create table like for Iceberg table?

2023-04-25 Thread Steve Zhang
https://iceberg.apache.org/docs/latest/spark-ddl/#create-table <https://iceberg.apache.org/docs/latest/spark-ddl/#create-table> Thanks, Steve Zhang > On Apr 25, 2023, at 1:46 PM, Pucheng Yang wrote: > > Hi all, > > I wonder how folks in the community deal with the cases where y

Re: [DISCUSS] Dropping Spark 2.4 support

2023-04-17 Thread Steve Zhang
+1 for dropping Spark 2.4 support and we can clean up doc as well such as https://iceberg.apache.org/docs/latest/spark-queries/#spark-24 Thanks, Steve Zhang > On Apr 13, 2023, at 12:53 PM, Jack Ye wrote: > > +1 for dropping 2.4 support >

Re: Welcome new PMC members!

2023-04-12 Thread Steve Zhang
Congratulations everyone! Thanks, Steve Zhang > On Apr 11, 2023, at 9:46 PM, Eduard Tudenhoefner wrote: > > Congrats to everyone! > > On Wed, Apr 12, 2023 at 6:14 AM Ajantha Bhat <mailto:ajanthab...@gmail.com>> wrote: > Congratulations to all. > > On Wed

Re: [DISCUSS] Removing python_legacy from the repo

2023-02-17 Thread Steve Zhang
Thank you Fokko and Ryan for your great work to reach feature parity. pyiceberg is the way to go! Thanks, Steve Zhang > On Feb 17, 2023, at 8:29 AM, Ryan Blue wrote: > > +1 for removing it. And it's great to see the new one reaching feature parity! > > On Fri, Feb

Re: [VOTE] Release Apache PyIceberg 0.2.0

2022-12-06 Thread Steve Zhang
But I cannot seem to figure out what’s wrong here. Thanks, Steve Zhang > On Dec 6, 2022, at 3:36 PM, Ryan Blue wrote: > > Russell, we normally test with `make test`, which runs everything but the S3 > mock stuff since that runs in CI. That said, it would be great if we could

Re: [VOTE] Release Apache PyIceberg 0.1.0 RC2

2022-09-30 Thread Steve Zhang
Thank you Fokko, also forgot to update my vote to +1 given package version is clarified. Thank you for the great work! Steve Zhang > On Sep 30, 2022, at 8:02 AM, Driesprong, Fokko wrote: > > Hey Everyone, > > Thanks all for checking the release, and we can conclude the vot

Re: [VOTE] Release Apache PyIceberg 0.1.0 RC2

2022-09-25 Thread Steve Zhang
in local they are fine) Issues: - same version issue as Ryan pointed out Thanks, Steve Zhang > On Sep 25, 2022, at 10:37 AM, Ryan Blue wrote: > > +0 > > Looks great, except that the version isn’t correct: pyiceberg.__version__ > returns 0.1.0rc2 > > Passin
