Re: Iceberg Help

2023-03-14 Thread Driesprong, Fokko
Hi Sun,

Thanks for reaching out. It looks like your Hive metastore is not running
or is not reachable. The Hive metastore acts as a catalog to keep track of
all the tables that you've created. You can spin up a metastore using
Docker. Running it in Docker is recommended since the metastore also
requires an RDBMS as a backend, for example, MySQL or Postgres. Another
option is to use a REST catalog with Iceberg. There is an example
(including a Jupyter Notebook) available online.
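To tell "not running" apart from a misconfiguration, a quick TCP check against the metastore port can help. The sketch below is only an illustration: the helper name `metastore_reachable` is made up here, and the default host and port are taken from the `thrift://localhost:9083` URI in the config quoted further down. It only shows whether something accepts connections on that port, not that it is a healthy metastore.

```python
import socket


def metastore_reachable(host: str = "localhost", port: int = 9083,
                        timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: the defaults mirror thrift://localhost:9083 from
    the quoted Spark config; adjust them to your own metastore URI.
    """
    try:
        # create_connection raises OSError (refused, timeout, DNS failure)
        # when nothing is listening or the host cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `metastore_reachable()` from the same Jupyter kernel before creating the SparkSession should return True once the metastore is up.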

Let us know if this helps!

Kind regards,
Fokko Driesprong


On Tue, 14 Mar 2023 at 06:03, Sun Shine wrote:

> Hello:
>
> I need some help with my pyspark config, as shown below. I don't know if
> there is a config issue or if I am missing jar files. Could someone please
> help to have a working pyspark config?
> I am using a standalone spark install on my server with jupyter lab. My
> goal is to create an Iceberg table using pyspark and then insert data into
> the Iceberg table. Once the config issue is resolved, and the Iceberg table
> is created, I can take it from there.
>
> Again, I would greatly appreciate any help you can give.
>
> *Jupyter Lab code:-*
>
> import pyspark
> from pyspark.sql import SparkSession
> from pyspark.sql.types import *
> from pyspark import SparkConf, SparkContext
>
>
> # Setup the Configuration
> sparkConf = pyspark.SparkConf()
>
> # sparkConf.set("spark.sql.extensions",
> #     "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
> sparkConf.set("spark.sql.catalog.jay_catalog",
>     "org.apache.iceberg.spark.SparkSessionCatalog")
> sparkConf.set("spark.sql.defaultCatalog", "jay_catalog")
> sparkConf.set("spark.sql.catalog.jay_catalog.type", "hive")
> sparkConf.set("spark.jars",
>     "/opt/spark/jars/hive-metastore-2.3.9.jar,"
>     "/opt/spark/jars/spark-hive-thriftserver_2.12-3.2.0.jar,"
>     "/opt/spark/jars/spark-hive_2.12-3.2.0.jar")
> sparkConf.set("spark.sql.hive.metastore.jars",
>     "/opt/spark/jars/hive-metastore-2.3.9.jar,"
>     "/opt/spark/jars/spark-hive-thriftserver_2.12-3.2.0.jar,"
>     "/opt/spark/jars/spark-hive_2.12-3.2.0.jar")
> sparkConf.set("spark.jars.packages",
>     "org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:1.1.0")
> sparkConf.set("spark.sql.execution.pyarrow.enabled", "true")
> sparkConf.set("spark.sql.catalog.jay_catalog.uri",
>     "thrift://localhost:9083")
> sparkConf.set("hive.metastore.uris", "thrift://localhost:9083")
> sparkConf.set("spark.sql.catalog.jay_catalog.warehouse",
>     "/opt/spark/pysparkjay/spark-warehouse")
>
> spark2 = SparkSession.builder \
>   .appName("Iceberg App") \
>   .master("local[12]") \
>   .config(conf=sparkConf) \
>   .enableHiveSupport() \
>   .getOrCreate()
>
> print("Spark2 Running")
>
>
> # Creating Table in SpqrkSQL
> spark2.sql( \
> """CREATE TABLE IF NOT EXISTS jay_catalog.db.patient_ice \
> ( \
> P_id bigint, \
> P_gender string, \
> P_DOB timestamp, \
> P_race string\
> ) \
> USING iceberg \
> PARTITIONED BY (months(P_DOB))""" \
> );
>
> *Getting errors as shown below.*
>
> Py4JJavaError Traceback (most recent call
> last)
>  in 
>   1 # Creating Table in SpqrkSQL
> > 2 spark2.sql( \
>   3 """CREATE TABLE IF NOT EXISTS jay_catalog.db.patient_ice \
>   4 ( \
>   5 P_id bigint, \
>
> /opt/spark/python/pyspark/sql/session.py in sql(self, sqlQuery)
> 721 [Row(f1=1, f2='row1'), Row(f1=2, f2='row2'), Row(f1=3,
> f2='row3')]
> 722 """
> --> 723 return DataFrame(self._jsparkSession.sql(sqlQuery),
> self._wrapped)
> 724
> 725 def table(self, tableName):
>
> ~/.local/lib/python3.8/site-packages/py4j/java_gateway.py in
> __call__(self, *args)
>1302
>1303 answer = self.gateway_client.send_command(command)
> -> 1304 return_value = get_return_value(
>1305 answer, self.gateway_client, self.target_id, self.name
> )
>1306
>
> /opt/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
> 109 def deco(*a, **kw):
> 110 try:
> --> 111 return f(*a, **kw)
> 112 except py4j.protocol.Py4JJavaError as e:
> 113 converted = convert_exception(e.java_exception)
>
> ~/.local/lib/python3.8/site-packages/py4j/protocol.py in
> get_return_value(answer, gateway_client, target_id, name)
> 324 value = OUTPUT_CONVERTER[type](answer[2:],
> gateway_client)
> 325 if answer[1] == REFERENCE_TYPE:
> --> 326 raise Py4JJavaError(
> 327 "An error occurred while calling {0}{1}{2}.\n".
> 328 format(target_id, ".", name), value)
>
> Py4JJavaError: An error occurred while calling o43.sql.
> : org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive
> Metastore
> at org.apache.iceberg.hive.HiveClientPool

Re: [VOTE] Release Apache Iceberg 1.2.0 RC1

2023-03-19 Thread Driesprong, Fokko
+1 (non-binding)

* Validated signature and checksum
* Ran license check
* Built using Java 8
* Checked using the docker-spark-iceberg setup, and all the example
notebooks ran fine.

Cheers, Fokko



On Sat, 18 Mar 2023 at 21:09, Ryan Blue wrote:

> +1 (binding)
>
> * Validated signature and checksum
> * Ran license check
> * Built using Java 8
> * Tested REST catalog using SigV4 and OAuth2
> * Tested remote signing
> * Tested lazy snapshot loading
> * Ran some basic read/write tests in Spark 3.3
>
> On Fri, Mar 17, 2023 at 5:15 PM Prashant Singh 
> wrote:
>
>> +1 (non-binding)
>>
>> 1. Verified checksum and signature
>>
>> 2. Verified license docs and ran RAT checks
>>
>> 3. Verified build and tests with JDK11
>>
>> 4. did manual testing with spark
>>
>>
>> Regards,
>>
>> Prashant Singh
>>
>> On Fri, Mar 17, 2023 at 5:04 PM Daniel Weeks  wrote:
>>
>>> +1 (binding)
>>>
>>> verified license/sigs/sums/build/test
>>>
>>> Also verified sigv4 and snapshot ref-only loading.
>>>
>>> Ran with Jdk17
>>>
>>> -Dan
>>>
>>> On Thu, Mar 16, 2023 at 10:46 PM Jahagirdar, Amogh
>>>  wrote:
>>>
 +1 (non-binding)

 1. Verified checksum and signature

 2. Verified license docs and ran RAT checks

 3. Verified build and all tests passed with JDK11

 4. Ran AWS integration tests



 Thanks,



 Amogh Jahagirdar



 *From: *Ajantha Bhat 
 *Reply-To: *"dev@iceberg.apache.org" 
 *Date: *Thursday, March 16, 2023 at 9:58 PM
 *To: *"dev@iceberg.apache.org" 
 *Subject: *RE: [EXTERNAL][VOTE] Release Apache Iceberg 1.2.0 RC1






 +1 (non-binding)

- verified Nessie integration testing with Spark-3.3_2.12_runtime
jar.
- validated checksum and signature
- checked license docs & ran RAT checks
- verified build with JDK11

 Thanks,
 Ajantha



 On Thu, Mar 16, 2023 at 4:31 AM Szehon Ho 
 wrote:

 Hi,



 One note on this release: I ran some simple Spark SQL using a local
 Spark, like "insert into table select 1". I find that any of these
 operations now spawns 200 executors and takes a while to finish.


 |== Physical Plan ==\nAppendData
 org.apache.spark.sql.execution.datasources.v2.DataSourceV2Strategy$$Lambda$4700/0x000801b1b040@2934b897,
 IcebergWrite(table=iceberg.szho.test, format=PARQUET)\n+- AdaptiveSparkPlan
 isFinalPlan=false\n   +- Exchange hashpartitioning(a#413, 200),
 REPARTITION_BY_NUM, [id=#363]\n  +- Project [1 AS id#412, b AS a#413]\n
 +- Scan OneRowRelation[]\n\n|



 I think it's expected, due to the change of the default distribution mode,
 which penalizes smaller jobs. I think it would be nice to have some guidance
 in the docs to give new users a more pleasant experience. Maybe a note in
 the getting-started guide on how to reduce the number of executors or turn
 off the distribution mode.
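 As a rough sketch of the two knobs mentioned above (not an official
 recommendation): Iceberg reads the write distribution mode from the table
 property `write.distribution-mode`, and the 200 in the plan matches Spark's
 default for `spark.sql.shuffle.partitions`. The helper name
 `distribution_mode_sql` and the value 8 are assumptions made for this
 illustration; the table name is the one from the plan above.

```python
# Distribution modes accepted by the Iceberg table property.
ALLOWED_MODES = {"none", "hash", "range"}

# Assumed example value for small local jobs; Spark's default is 200.
SMALL_JOB_CONF = {"spark.sql.shuffle.partitions": "8"}


def distribution_mode_sql(table: str, mode: str = "none") -> str:
    """Build the ALTER TABLE statement that sets write.distribution-mode.

    Hypothetical helper: it only builds the SQL string; run the result
    with spark.sql(...) in your own session.
    """
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported distribution mode: {mode!r}")
    return (
        f"ALTER TABLE {table} "
        f"SET TBLPROPERTIES ('write.distribution-mode'='{mode}')"
    )


print(distribution_mode_sql("iceberg.szho.test"))
```

 Setting the mode to `none` skips the shuffle before the write; lowering the
 shuffle partitions keeps the shuffle but makes it cheaper for tiny inserts.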



 That being said, I'm +1 (non-binding), aside from that.

- Verified signature
- Verified checksum
- Rat check license
- Ran build and test (some aws test failed to create embedded jetty
server because of keystore, probably local environment error)
- Ran simple operations on Spark



 Thanks

 Szehon



 On Wed, Mar 15, 2023 at 8:54 AM Eduard Tudenhoefner 
 wrote:

 +1 (non-binding)

 · validated checksum and signature

 · checked license docs & ran RAT checks

 · ran build and tests with JDK11

 · integrated into Trino / Presto and our internal platform

 · ran a few manual steps in Spark 3.3



 Just FYI that the release notes will usually be available once voting
 on the RC passed and artifacts are publicly available.



 Thanks

 Eduard



 On Tue, Mar 14, 2023 at 5:19 AM Jack Ye  wrote:

 Hi Everyone,

 I propose that we release the following RC as the official Apache
 Iceberg 1.2.0 release.

 The commit ID is e340ad5be04e902398c576f431810c3dfa4fe717
 * This corresponds to the tag: apache-iceberg-1.2.0-rc1
 * https://github.com/apache/iceberg/commits/apache-iceberg-1.2.0-rc1
 *
 https://github.com/apache/iceberg/tree/e340ad5be04e902398c576f431810c3dfa4fe717

 The release tarball, signature, and checksums are here:
 *
 https://dist.apache.org/repos/dist/dev/iceberg/apache

Re: C++/Rust SDK sync

2023-04-07 Thread Driesprong, Fokko
Hi Jan,

Thanks for raising this, and I'd love to join the sync. I did quite a bit
of work on the Python implementation, and I'm happy to help with the
Rust/C++ SDK as well. I'm neither a Rust nor C++ programmer (did some C++
in the past), but happy to help with the implementation by providing
context. For me, all of the abovementioned slots work.

Kind regards,
Fokko Driesprong

On Fri, 7 Apr 2023 at 14:10, Jan Kaul wrote:

> Hi iceberg community,
>
> As discussed in the last Iceberg Sync, it would be great to have
> another meeting to discuss how to combine our efforts for a C++ and/or
> Rust SDK.
>
> Here are three possible dates for the Sync:
>
> 1. 18.04.23 16:00 UTC
>
> 2. 19.04.23 16:00 UTC
>
> 3. 20.04.23 16:00 UTC
>
> For those who want to join the meeting, it would be great if you could
> answer this email with the dates that you are available. I will then
> create an online meeting for the date where most people can join.
>
> I'm looking forward to talking to you.
>
> Best wishes,
>
> Jan Kaul
>
>


Re: [DISCUSS] Dropping Spark 2.4 support

2023-04-20 Thread Driesprong, Fokko
Thanks all for the response, much appreciated.

> That said, I'd love to hear from more people on this. I think it would be
> great to drop support, but I don't know how many people still use it. Is
> upgrading Hadoop a good reason to drop support for an engine? Hadoop seems
> like a minor concern to me unless it is blocking something.


I noticed that we needed to bump Hadoop when we wanted to upgrade to
Parquet 1.13.0. It would be nice to get this in since it allows for
removing a workaround from the Iceberg codebase (see PR for details).

> Netflix is still on Spark-2.4.4 with Iceberg-0.9. We are actively migrating
> to Spark-3.x and Iceberg 1.1 (or later). I do not anticipate us
> using Spark-2.4.4 with newer versions of Iceberg (>0.9).


For Spark 2.4, Iceberg releases up to 1.2.1 are available:
https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-spark-2.4

> As for the Hadoop upgrade, I think that could be problematic for us if
> there's any non-backwards compatible API change required at compile time
> since we're still running a 2.8.x version.


Thanks for raising this. I took some time today to dig into this. There is
an effort to upgrade Hadoop in Iceberg, but that's stuck on
incompatibilities with Tez. Unfortunately, Parquet 1.13.0 doesn't compile
against Hadoop 2.8.5, and bringing back support for Hadoop 2.8.x is going to
be hard. For Parquet, I've created a PR to run the CI against Hadoop 2.9.2
so we know when we're breaking compatibility.

TL;DR: It looks like if we want to upgrade Parquet, and other libraries in
the future, we need to drop Hadoop 2. I'm hesitant to do that right now
because we might exclude users that are still on older versions of Hadoop
(such as Airbnb). Spark has announced that Hadoop 2 support will be dropped
in Spark 3.5. I'll create a PR for removing Spark 2.4 shortly because I see
a consensus for removing that.

Kind regards,
Fokko

On Wed, 19 Apr 2023 at 19:02, Anton Okolnychyi wrote:

> Yes, yes, yes!
>
> - Anton
>
> On Apr 19, 2023, at 8:17 AM, Ryan Blue  wrote:
>
> Sounds like we have consensus for removing Spark 2.4.
>
> Thanks, everyone!
>
> On Wed, Apr 19, 2023 at 12:36 AM Ajantha Bhat 
> wrote:
>
>> +1,
>> Spark-2.4 has reached EOL (
>> https://lists.apache.org/thread/tdk7r5gx3nwrds3fg7qmp5h2jnqgc6tb and
>> https://spark.apache.org/versioning-policy.html)
>>
>> Thanks,
>> Ajantha
>>
>> On Wed, Apr 19, 2023 at 3:52 AM Edgar Rodriguez <
>> edgar.rodrig...@airbnb.com.invalid> wrote:
>>
>>> I'm generally +1 on dropping Spark 2.4 - mostly everyone is moving to
>>> Spark 3.x, if not already moved.
>>>
>>> As for the Hadoop upgrade, I think that could be problematic for us if
>>> there's any non-backwards compatible API change required at compile time
>>> since we're still running a 2.8.x version.
>>>
>>> Cheers,
>>>
>>> On Mon, Apr 17, 2023 at 3:50 PM Steve Zhang <
>>> hongyue_zh...@apple.com.invalid> wrote:
>>>
 +1 for dropping Spark 2.4 support and we can clean up doc as well such
 as https://iceberg.apache.org/docs/latest/spark-queries/#spark-24

 Thanks,
 Steve Zhang



 On Apr 13, 2023, at 12:53 PM, Jack Ye  wrote:

 +1 for dropping 2.4 support



>>>
>>> --
>>> Edgar R
>>> Data Warehouse Infrastructure
>>>
>>
>
> --
> Ryan Blue
> Tabular
>
>
>


Re: Java 1.3.0 around mid May?

2023-04-28 Thread Driesprong, Fokko
Hey Anton,

I think that's a great idea. Would be great to get Spark 3.4 and Flink 1.17
support out. I saw some questions on the Apache Slack today and also
noticed that Spark supports timestamps without timezone from Spark 3.4
onward and went ahead and created an issue for that.

> We might also want to add the Hadoop and Parquet upgrades as a part of this.


Do we have a consensus on upgrading Hadoop? Hadoop in Iceberg is still at
2.7.x, while Parquet 1.13.0 requires Hadoop 3.x+. I've backported support
for Hadoop 2.9.x, which will be part of the patch release 1.13.1. Today at
the Parquet sync I volunteered as a release manager to do the patch release.
This would also come with running Flink without Hadoop, which would be
awesome.

I'm all in for doing another release in May, so we can get this in as well.
WDYT?

Kind regards,
Fokko Driesprong




On Fri, 28 Apr 2023 at 19:54, Jack Ye wrote:

> Looking at the milestone https://github.com/apache/iceberg/milestone/26,
> most of the PRs are in good progress except for
> https://github.com/apache/iceberg/issues/7449. Given the fact that many
> people are looking forward to using Spark 3.4 and Flink 1.17, it's probably
> worth having a quick release.
>
> We might also want to add the Hadoop and Parquet upgrades as a part of
> this.
>
> -Jack
>
>
>
> On Thu, Apr 27, 2023 at 6:35 PM Anton Okolnychyi
>  wrote:
>
>> We briefly discussed the idea of releasing 1.3.0 around mid May during
>> the sync. The primary goal is to support Spark 3.4 and Flink 1.17. There
>> are also a few other notable changes that went in or are very close. Do we
>> see any blockers that we should track?
>>
>> - Anton
>
>


Re: [VOTE] Release PyIceberg 0.4.0 RC1

2023-06-27 Thread Driesprong, Fokko
Thank you Dan, great catch!

I've pinned the version of
pyparsing. It looks like it switched to non-greedy matching. The first
commit updates the tests, which exposes the issue, and the second commit
locks pyparsing to <3.1.0. It looks like the pyparsing library is rapidly
evolving, and the best thing for now is to just pin it to a range that
works.
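
To see whether a local environment falls inside that range, the installed version can be compared against the bound. This is a minimal sketch using only the standard library; the helper names are made up for this example, and the parser is deliberately simple (numeric components only, so a pre-release such as 3.1.0b1 is treated like 3.1.0).

```python
from importlib import metadata
from typing import Optional, Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Parse leading numeric components, padded to three: '3.1' -> (3, 1, 0)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def within_pin(version: str, upper_bound: str = "3.1.0") -> bool:
    """Return True if version is strictly below the upper bound."""
    return version_tuple(version) < version_tuple(upper_bound)


def installed_pyparsing_within_pin() -> Optional[bool]:
    """Check the installed pyparsing; None when it is not installed."""
    try:
        return within_pin(metadata.version("pyparsing"))
    except metadata.PackageNotFoundError:
        return None
```

For anything beyond a quick check, a real version parser (for example the one pip itself uses) is the safer choice.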

Let's cancel this vote, and once the PR is merged, I'll cut the next RC.

Kind regards, Fokko

On Tue, 27 Jun 2023 at 21:14, Daniel Weeks wrote:

> I ran into an issue with the row filtering:
>
> t.scan(row_filter="location_id > 1").to_pandas()
>
> File
> ~/workspace/apache/releases/pyiceberg/0.4.0-rc1/pyiceberg-0.4.0/pyiceberg/schema.py:183,
> in Schema.find_field(self, name_or_id, case_sensitive)
> 180 field_id = self._lazy_name_to_id_lower.get(name_or_id.lower())
> 182 if field_id is None:
> --> 183 raise ValueError(f"Could not find field with name
> {name_or_id}, case_sensitive={case_sensitive}")
> 185 return self._lazy_id_to_field[field_id]
>
> ValueError: Could not find field with name l, case_sensitive=True
>
> I shared this with Fokko.
>
> -Dan
>
>
> On Mon, Jun 26, 2023 at 9:58 PM Jean-Baptiste Onofré 
> wrote:
>
>> +1 (non binding)
>>
>> Regards
>> JB
>>
>> On Mon, Jun 26, 2023 at 11:27 AM Fokko Driesprong 
>> wrote:
>> >
>> > Hi Everyone,
>> >
>> >
>> > Excited to start the 0.4.0 PyIceberg release process. The 0.4.0 release
>> is packed with cool features:
>> >
>> > Support for converting Parquet schemas into Iceberg ones
>> > Support for reading data using FSSpec.
>> > Support fetching a limited number of rows to quickly peek into a
>> dataset.
>> > Reduced the number of calls to the object store with PyArrow>=12.0.0.
>> > Speed up queries using the Iceberg metrics.
>> > Ability to do SQL style filters: row_filter='passengers >= 3'.
>> > SigV4 support for the REST catalog.
>> > A complete makeover of the docs site.
>> > Support for positional deletes.
>> > Ability to set table properties.
>> > And many bugs have been fixed!
>> >
>> >  I propose that we release the following RC as the official PyIceberg
>> 0.4.0 release. The commit ID is e85ec9447c08c1a21e9ef21278f3237811f3f67f
>> >
>> >
>> > * This corresponds to the tag: pyiceberg-0.4.0rc1
>> (c3579a11b4bfa5387e313185e714c40a0ed1ccfe)
>> >
>> > * https://github.com/apache/iceberg/releases/tag/pyiceberg-0.4.0rc1
>> >
>> > *
>> https://github.com/apache/iceberg/tree/e85ec9447c08c1a21e9ef21278f3237811f3f67f
>> >
>> >
>> > The release tarball, signature, and checksums are here:
>> >
>> >
>> > * https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.4.0rc1/
>> >
>> >
>> > You can find the KEYS file here:
>> >
>> >
>> > * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>> >
>> >
>> > Convenience binary artifacts are staged on pypi:
>> >
>> >
>> > https://pypi.org/project/pyiceberg/0.4.0rc1/
>> >
>> >
>> > And can be installed using: pip3 install pyiceberg==0.4.0rc1
>> >
>> >
>> > Please download, verify, and test.
>> >
>> >
>> > Please vote in the next 72 hours.
>> >
>> > [ ] +1 Release this as PyIceberg 0.4.0
>> >
>> > [ ] +0
>> >
>> > [ ] -1 Do not release this because...
>> >
>> >
>> > Please consider this email a +1 from my side:
>> >
>> > Ran some basic table scans
>> >
>> > Including tables with positional deletes
>> >
>> > Checked to see if everything still works when PyArrow is not installed
>> > Set some table properties
>> >
>> > Kind regards,
>> >
>> > Fokko
>>
>


Re: [VOTE] Release PyIceberg 0.4.0 RC2

2023-06-27 Thread Driesprong, Fokko
Sorry about that. Looks like I didn't properly clean up my dist/ folder.
I've updated the SVN location. I'll update the how-to-release guide to make
sure that this won't happen again.

Kind regards,
Fokko

On Tue, 27 Jun 2023 at 23:04, Ryan Blue wrote:

> Any idea why there are rc1 artifacts here?
> https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.4.0rc2/
>
> On Tue, Jun 27, 2023 at 1:37 PM Fokko Driesprong  wrote:
>
>> All,
>>
>>
>> Excited to start the 0.4.0 PyIceberg release process. The 0.4.0 release
>> is packed with awesome features:
>>
>>- Support for converting Parquet schemas into Iceberg ones.
>>- Support for reading data using FSSpec.
>>- Support fetching a limited number of rows to quickly peek into a
>>dataset.
>>- Reduced the number of calls to the object store with PyArrow>=12.0.0.
>>- Speed up queries using the Iceberg metrics.
>>- Ability to do SQL style filters: row_filter='passengers >= 3'.
>>- SigV4 support for the REST catalog.
>>- A complete makeover of the docs site.
>>- Support for positional deletes.
>>- Ability to set table properties.
>>- And many bugs have been fixed!
>>
>> I propose that we release the following RC as the official PyIceberg
>> 0.4.0 release. The commit ID is 51eaf6806361e6e0a5cd163071dce684ec05350b
>>
>>
>> * This corresponds to the tag: pyiceberg-0.4.0rc2 (
>> f81c759835672e956c71280394f432463d25463c)
>>
>> * https://github.com/apache/iceberg/releases/tag/pyiceberg-0.4.0rc2
>>
>> *
>> https://github.com/apache/iceberg/tree/51eaf6806361e6e0a5cd163071dce684ec05350b
>>
>>
>> The release tarball, signature, and checksums are here:
>>
>>
>> * https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.4.0rc2/
>>
>>
>> You can find the KEYS file here:
>>
>>
>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>
>>
>> Convenience binary artifacts are staged on pypi:
>>
>>
>> https://pypi.org/project/pyiceberg/0.4.0rc2/
>>
>>
>> And can be installed using: pip3 install pyiceberg==0.4.0rc2
>>
>>
>> Please download, verify, and test.
>>
>>
>> Please vote in the next 72 hours.
>>
>> [ ] +1 Release this as PyIceberg 0.4.0
>>
>> [ ] +0
>>
>> [ ] -1 Do not release this because...
>>
>>
>> Please consider this email a +1 from my side:
>>
>>
>>- Ran some basic table scans
>>   - Including tables with positional deletes
>>- Checked to see if everything still works when PyArrow is not
>>installed
>>- Set some table properties
>>
>> Kind regards,
>>
>> Fokko
>>
>
>
> --
> Ryan Blue
> Tabular
>


Re: [VOTE] Release Apache PyIceberg 0.5.0

2023-09-12 Thread Driesprong, Fokko
Hey everyone,

After an issue on GitHub, I noticed a bug in PyIceberg: the filesystem isn't
being reused. I think there is more room for improvement (both in the long
and short term), but I don't think we should block the release on that since
0.5.0 is already much faster due to improved Avro parsing, improved IO, and
the previously mentioned bugfix (and one that was merged earlier today).

I'll cut another RC as soon as #8549 is in. Thanks everyone for the
patience!

Cheers, Fokko

On Mon, 11 Sep 2023 at 14:22, Fokko Driesprong wrote:

> Hi Everyone,
>
> I propose that we release the following RC as the official PyIceberg 0.5.0
> release. A summary of what's included in 0.5.0:
>
>- Add gzip metadata support
>- PyArrow HDFS support
>- Support serverless environments (AWS Lambda)
>- Many fixes around Avro performance (PRs 1, 2, 3, 4)
>- Remove the upper bound of PyParsing dependency (blocking a PR in
>Airflow)
>- Moving the reading of Avro to Cython (10x speed improvement(!))
>- Support for the SQLCatalog (JDBC in Java)
>- Fix support for UUID columns
>- Support for adding columns
>- Optimize concurrency (follow up on the Support serverless environments)
>- Bump Pydantic to v2 (improved performance of the JSON
>(de)serialization)
>- A lot of bugfixes!
>
> The commit ID is 3323281045a72f1156d58c261067469e383fb26d
>
> * This corresponds to the tag: pyiceberg-0.5.0rc2
> (92600935834bdf77ba37ac361338712713549a77)
> * https://github.com/apache/iceberg/releases/tag/pyiceberg-0.5.0rc2
> *
> https://github.com/apache/iceberg/tree/3323281045a72f1156d58c261067469e383fb26d
>
> The release tarball, signature, and checksums are here:
>
> * https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.5.0rc2/
>
> You can find the KEYS file here:
>
> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>
> Convenience binary artifacts are staged on pypi:
>
> https://pypi.org/project/pyiceberg/0.5.0rc2/
>
> And can be installed using: pip3 install pyiceberg==0.5.0rc2
>
> Since a lot has changed due to the release of the wheels (binary Python
> libraries), I've included the following steps to verify the release:
>
> curl https://dist.apache.org/repos/dist/dev/iceberg/KEYS -o KEYS
> gpg --import KEYS
>
> svn checkout https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.5.0rc2/ /tmp/pyiceberg/
>
> for name in $(ls /tmp/pyiceberg/pyiceberg-*.whl /tmp/pyiceberg/pyiceberg-*.tar.gz)
> do
>     gpg --verify ${name}.asc ${name}
> done
>
> cd /tmp/pyiceberg/
> for name in $(ls /tmp/pyiceberg/pyiceberg-*.whl.asc.sha512 /tmp/pyiceberg/pyiceberg-*.tar.gz.asc.sha512)
> do
>     shasum -a 512 --check ${name}
> done
>
> tar xzf pyiceberg-0.5.0.tar.gz
> cd pyiceberg-0.5.0
>
> ./dev/check-license
>
> Please download, verify, and test.
>
> Please vote in the next 72 hours.
> [ ] +1 Release this as PyIceberg 0.5.0
> [ ] +0
> [ ] -1 Do not release this because...
>
> Please consider this my +1, I've checked against the docker-spark-iceberg
> notebook, and did some checks.
>
> Kind regards,
> Fokko Driesprong
>
>
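
For readers who prefer to script the checksum step from the verification instructions above, here is a minimal Python sketch of a SHA-512 comparison. The function names are made up for this example, and it assumes the checksum file starts with the hex digest followed by whitespace (the layout that `shasum` writes). It does not replace the GPG signature check.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum_file(artifact: Path, checksum_file: Path) -> bool:
    """Compare an artifact against the digest stored in a checksum file.

    Assumption: the first whitespace-separated token in the checksum file
    is the expected hex digest.
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

A call would pass the downloaded artifact and the `.sha512` file that covers it, both taken from the checked-out release directory.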


Re: [VOTE] Release Apache Iceberg 1.4.0 RC2

2023-10-01 Thread Driesprong, Fokko
+1 (binding)

Thanks Anton for running the release, and everyone for contributing!

   - Ran license checks
   - Validated signature and checksum
   - Ran notebooks against 1.4.0 with iceberg-aws-bundle
   - Tested against Trino, and found three differences, all of them expected:
  - More defensive cleaning up of files on a failed commit, to make
  table recovery easier when needed.
  - A new property that's set on the table, indicating zstd compression.
  - Changes in the exceptions when binding a transform to a column type
  that is not allowed

Kind regards, Fokko


On Sun, 1 Oct 2023 at 22:43, Ryan Blue wrote:

> +1 (binding)
>
> - Ran license checks (dev/check-license)
> - Validated signature and checksum
> - Built and ran tests with Flink 1.17 and Spark 3.5 in Java 11
> - Ran queries in Spark 3.5 with the iceberg-aws-bundle providing S3
> dependencies
> - Checked CI tests are all passing
>
> On Sun, Oct 1, 2023 at 1:53 AM Ajantha Bhat  wrote:
>
>> +1 (non-binding)
>>
>> - Verified Nessie integration testing (API v2 and V1) with
>> Spark-3.3_2.12_runtime jar.
>> - Validated checksum and signature
>> - Checked license docs & ran RAT checks
>> - Verified build with JDK11
>>
>>
>> @Dan:
>> Flink test failure with Java 17 is tracked from
>> https://github.com/apache/iceberg/issues/8680
>> and it seems Flink doesn't officially support Java 17 in the current
>> Iceberg integrated versions.
>> So, we are good to go I guess.
>>
>> Thanks,
>> Ajantha
>>
>> On Sun, Oct 1, 2023 at 3:45 AM Daniel Weeks  wrote:
>>
>>> +1 (binding)
>>>
>>> Verified sigs/sums/license/build/test
>>>
>>> Using Java 17 I had failures in Flink tests (seems isolated to the Flink
>>> test framework, so not a blocker):
>>> TestIcebergSourceFailover > testBoundedWithTaskManagerFailover FAILED
>>> java.lang.IllegalAccessError: class org.apache.flink.util.NetUtils
>>> (in unnamed module @0x37858383) cannot access class
>>> sun.net.util.IPAddressUtil (in module java.base) because module java.base
>>> does not export sun.net.util to unnamed module @0x37858383
>>>
>>> However, these passed when I switched to Java 8
>>>
>>> I also performed some manual validation using Spark 3.5.
>>>
>>> Looks good!
>>> -Dan
>>>
>>>
>>> On Sat, Sep 30, 2023 at 12:13 PM Hussein Awala  wrote:
>>>
 +1 (non-binding) I tested it with Spark 3.3, all looks good.

 On Sat, Sep 30, 2023 at 9:04 PM Bryan Keller  wrote:

> +1 (non-binding)
>
> I reran the TPC-DS benchmark with RC2, with the same setup as with
> RC1, and there were no warnings about decimal pushdown, so that appears
> resolved. The results were also a bit better at 4915 sec.
>
> -Bryan
>
> On Fri, Sep 29, 2023 at 10:37 PM Anton Okolnychyi <
> aokolnyc...@apache.org> wrote:
>
>> +1 (binding)
>>
>> Validated signature, checksum, local build + tests.
>>
>> - Anton
>>
>> On 2023/09/30 04:58:15 Jean-Baptiste Onofré wrote:
>> > +1 (non binding)
>> >
>> > As for RC1, I checked:
>> > - signature and hash are OK
>> > - ASF headers are there
>> > - source distribution doesn't contain binary
>> > - build is OK
>> >
>> > Thanks,
>> > Regards
>> > JB
>> >
>> > On Sat, Sep 30, 2023 at 1:25 AM Anton Okolnychyi
>> >  wrote:
>> > >
>> > > Hi Everyone,
>> > >
>> > > I propose that we release the following RC as the official Apache
>> Iceberg 1.4.0 release.
>> > >
>> > > The commit ID is 10367c380098c2e06a49521a33681ac7f6c64b2c
>> > > * This corresponds to the tag: apache-iceberg-1.4.0-rc2
>> > > *
>> https://github.com/apache/iceberg/commits/apache-iceberg-1.4.0-rc2
>> > > *
>> https://github.com/apache/iceberg/tree/10367c380098c2e06a49521a33681ac7f6c64b2c
>> > >
>> > > The release tarball, signature, and checksums are here:
>> > > *
>> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.4.0-rc2
>> > >
>> > > You can find the KEYS file here:
>> > > * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>> > >
>> > > Convenience binary artifacts are staged on Nexus. The Maven
>> repository URL is:
>> > > *
>> https://repository.apache.org/content/repositories/orgapacheiceberg-1146/
>> > >
>> > > Please download, verify, and test.
>> > >
>> > > Please vote in the next 72 hours. (Weekends excluded)
>> > >
>> > > [ ] +1 Release this as Apache Iceberg 1.4.0
>> > > [ ] +0
>> > > [ ] -1 Do not release this because...
>> > >
>> > > Only PMC members have binding votes, but other community members
>> are encouraged to cast non-binding votes. This vote will pass if there
>> are 3 binding +1 votes and more binding +1 votes than -1 votes.
>> > >
>> > > - Anton
>> > >
>> >
>>
>
>
> --
> R

Re: [PROPOSAL] Apache Iceberg 1.4.3 release

2023-11-19 Thread Driesprong, Fokko
Hey JB,

Late to the party here, but 1.4.3 sounds like a great idea. Let me know if
you need any help with any release steps.

Kind regards,
Fokko Driesprong

On Mon, 20 Nov 2023 at 08:16, Jean-Baptiste Onofré wrote:

> Hi
>
> As there's no objection, I will move forward and prepare the release to
> vote.
>
> I will keep you posted asap.
>
> Thanks,
> Regards
> JB
>
> On Wed, Nov 15, 2023 at 6:11 AM Jean-Baptiste Onofré 
> wrote:
> >
> > Hi guys,
> >
> > Avro 1.11.3 has been released, fixing CVE-2023-39410.
> > We already updated to Avro 1.11.3 on main.
> >
> > About CVE, we also already use guava 32.1.3, fixing CVE-2023-2976.
> >
> > As the Avro CVE is classified high (see
> > https://nvd.nist.gov/vuln/detail/CVE-2023-39410), I propose to bump to
> > Avro 1.11.3 on our 1.4.x branch and release Iceberg 1.4.3 including
> > this.
> >
> > Thoughts ?
> >
> > If there are no objections, I volunteer to drive this release.
> >
> > Thanks,
> > Regards
> > JB
>


Re: [PROPOSAL] Apache Iceberg 1.4.3 release

2023-11-19 Thread Driesprong, Fokko
I took the liberty and created a 1.4.3 milestone
<https://github.com/apache/iceberg/milestone/43> to track any issues that
we want to backport.

Kind regards,
Fokko Driesprong

On Mon, 20 Nov 2023 at 08:50, Driesprong, Fokko wrote:

> Hey JB,
>
> Late to the party here, but 1.4.3 sounds like a great idea. Let me know if
> you need any help with any release steps.
>
> Kind regards,
> Fokko Driesprong
>
> On Mon, 20 Nov 2023 at 08:16, Jean-Baptiste Onofré wrote:
>
>> Hi
>>
>> As there's no objection, I will move forward and prepare the release to
>> vote.
>>
>> I will keep you posted asap.
>>
>> Thanks,
>> Regards
>> JB
>>
>> On Wed, Nov 15, 2023 at 6:11 AM Jean-Baptiste Onofré 
>> wrote:
>> >
>> > Hi guys,
>> >
>> > Avro 1.11.3 has been released, fixing CVE-2023-39410.
>> > We already updated to Avro 1.11.3 on main.
>> >
>> > About CVE, we also already use guava 32.1.3, fixing CVE-2023-2976.
>> >
>> > As the Avro CVE is classified high (see
>> > https://nvd.nist.gov/vuln/detail/CVE-2023-39410), I propose to bump to
>> > Avro 1.11.3 on our 1.4.x branch and release Iceberg 1.4.3 including
>> > this.
>> >
>> > Thoughts ?
>> >
>> > If there are no objections, I volunteer to drive this release.
>> >
>> > Thanks,
>> > Regards
>> > JB
>>
>


Re: Apache Iceberg Slack Invitation Required

2023-11-20 Thread Driesprong, Fokko
Hey Trilok,

Thanks for reaching out. I'll make sure to update the Slack URL. Can you
check using:
 
https://join.slack.com/t/apache-iceberg/shared_invite/zt-27f22riz7-o8nCsl5Vbc_2h6~3DF6qlw


Kind regards,
Fokko

On Mon, 20 Nov 2023 at 19:06, Trilok Tourani wrote:

> Please send the apache iceberg slack channel invite. I am unable to follow
> through via https://iceberg.apache.org/community/#slack
>
>
> Regards
> Trilok Tourani
>


Re: [VOTE] Release Apache Iceberg 1.4.3 RC0

2023-12-21 Thread Driesprong, Fokko
+1 (Binding)

Thanks JB for running this release!

- Checked the signature and checksums
- Ran the license check
- Ran the tests locally
- Tested against Trino:
https://github.com/trinodb/trino/pull/20207

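For anyone repeating the checksum step from the list above, here is a
minimal sketch in Python. It assumes the checksum file follows the common
"hex digest, then file name" layout; the function names and any paths you
pass in are hypothetical, and the signature check itself (gpg against the
KEYS file) is not covered.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        # iter() with a sentinel stops as soon as read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact: Path, checksum_file: Path) -> bool:
    """Compare the artifact digest with the first token of the checksum file.

    Assumption: the checksum file starts with the hex digest, optionally
    followed by whitespace and the artifact name.
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

Calling checksum_matches(tarball, tarball_sha512) returns True only when
the downloaded tarball is byte-for-byte what the checksum file describes.
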
Kind regards,
Fokko

On Thu, Dec 21, 2023 at 16:53 Jean-Baptiste Onofré wrote:

> Hi Everyone,
>
> I propose that we release the following RC as the official Apache
> Iceberg 1.4.3 release.
>
> The commit ID is 9a5d24fee239352021a9a73f6a4cad8ecf464f01
> * This corresponds to the tag: apache-iceberg-1.4.3-rc0
> * https://github.com/apache/iceberg/commits/apache-iceberg-1.4.3-rc0
> *
> https://github.com/apache/iceberg/tree/9a5d24fee239352021a9a73f6a4cad8ecf464f01
>
> The release tarball, signature, and checksums are here:
> * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.4.3-rc0
>
> You can find the KEYS file here:
> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>
> Convenience binary artifacts are staged on Nexus. The Maven repository URL
> is:
> *
> https://repository.apache.org/content/repositories/orgapacheiceberg-1149/
>
> Please download, verify, and test.
>
> Please vote in the next 72 hours. (Weekends excluded)
>
> [ ] +1 Release this as Apache Iceberg 1.4.3
> [ ] +0
> [ ] -1 Do not release this because...
>
> Only PMC members have binding votes, but other community members are
> encouraged to cast
> non-binding votes. This vote will pass if there are 3 binding +1 votes
> and more binding
> +1 votes than -1 votes.
>

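The passing rule quoted above (at least 3 binding +1 votes, and more
binding +1 votes than binding -1 votes) can be written down as a small
check. This is only an illustrative sketch: the Vote type and the
vote_passes function are hypothetical names, not part of any Apache
tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int     # +1, 0, or -1
    binding: bool  # True when cast by a PMC member


def vote_passes(votes: list[Vote]) -> bool:
    """Apply the rule from the vote email: only binding votes count."""
    plus = sum(1 for v in votes if v.binding and v.value == 1)
    minus = sum(1 for v in votes if v.binding and v.value == -1)
    return plus >= 3 and plus > minus
```

Non-binding votes are deliberately ignored by the tally; they are
encouraged as a signal but do not change the outcome.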

[ANNOUNCE] Release Apache Iceberg Rust 0.2.0

2024-02-20 Thread Driesprong, Fokko
Hi all,

The Apache Iceberg Rust community is pleased to announce that Apache
Iceberg Rust 0.2.0 has been released!

Iceberg is an open table format for analytic datasets, and Iceberg Rust is
the native Rust implementation of Iceberg.

This first release provides integration with the REST catalog and a lot of
scaffolding that's needed for reading the data.

Please refer to the change log for the complete list of changes:

https://github.com/apache/iceberg-rust/releases/tag/0.2.0/

Apache Iceberg Rust website: https://rust.iceberg.apache.org/

Download Links: https://crates.io/crates/iceberg

Iceberg Resources:

- Issue: https://github.com/apache/iceberg-rust/issues
- Mailing list: dev@iceberg.apache.org
- Or join the #rust Slack channel:
https://join.slack.com/t/apache-iceberg/shared_invite/zt-287g3akar-K9Oe_En5j1UL7Y_Ikpai3A

Another huge thanks for making this possible: Amogh Jahagirdar, Chengxu
Bian, Christian Daudt, Farooq Qaiser, JanKaul, Manu Zhang, Mark Grey,
Renjie Liu, Tyler Schauer, Xiaoyang Liu, Xuanwo, ZENOTME, barronw,
hiirrxnn, y0psolo, yi wang, zhjwpku and of course dependabot[bot] for
working on this first release!

Thanks
On behalf of Apache Iceberg Community


Re: [VOTE] Fix property names in REST spec for statistics / partition statistics

2024-07-09 Thread Driesprong, Fokko
+1 (binding)

On Wed, Jul 10, 2024 at 07:47 Renjie Liu wrote:

> +1 (non binding)
>
> On Wed, Jul 10, 2024 at 1:45 PM Daniel Weeks  wrote:
>
>> +1 (binding)
>>
>> On Tue, Jul 9, 2024, 8:35 PM Eduard Tudenhöfner 
>> wrote:
>>
>>> Hey everyone,
>>>
>>> I propose to fix the property names in the REST spec for statistics /
>>> partition statistics so that they are properly aligned with the table
>>> spec and the implementation.
>>>
>>> Please vote within the next 72 hours.
>>>
>>> Eduard
>>>
>>


Re: [Vote] Deprecate oauth tokens endpoint

2024-07-10 Thread Driesprong, Fokko
+1 (binding)

On Wed, Jul 10, 2024 at 10:14 Jean-Baptiste Onofré wrote:

> +1 (non binding)
>
> NB: a few comments in the PR should be "addressed", but it's OK.
>
> Regards
> JB
>
> On Mon, Jul 8, 2024 at 6:15 PM Robert Stupp  wrote:
> >
> > Hi Everyone,
> >
> > I propose that we merge PR [1] to "Deprecate oauth/tokens endpoint".
> >
> > The background and overall plan is discussed on this mailing list [2]
> > and this google doc [3].
> >
> > Please vote in the next 72 hours.
> >
> > Robert
> >
> >
> >
> > [1] https://github.com/apache/iceberg/pull/10603
> >
> > [2] https://lists.apache.org/thread/twk84xx7v0xy5q5tfd9x5torgr82vv50 and
> > https://lists.apache.org/thread/wcm9ylm0nbwfrx65n8b1tpjrdhgvcx24 and
> > https://lists.apache.org/thread/qksh9j9d8h6nt6qrfl47bj76jthddb0p
> >
> > [3]
> >
> > https://docs.google.com/document/d/1Xi5MRk8WdBWFC3N_eSmVcrLhk3yu5nJ9x_wC0ec6kVQ
> >
> > --
> > Robert Stupp
> > @snazy
> >
>


Re: [VOTE] Release Apache PyIceberg 0.7.0rc2

2024-07-30 Thread Driesprong, Fokko
Huge thanks Sung for running this, that's a long list of new features.

+1 (binding)

- Validated signatures/checksums/license
- Ran tests locally and identified two minor issues (#979, #980), but no
correctness issues
- Manually checked some of the Avro metadata to validate the
FastAppend/MergeAppend strategies

Kind regards,
Fokko

On Tue, Jul 30, 2024 at 16:39 Jack Ye wrote:

> +1 (binding)
>
> - Verified signature, license, checksum
> - Ran build and tests (python 3.11)
> - Ran S3 and Glue integration and manual tests
>
> Best,
> Jack Ye
>
>
> On Tue, Jul 30, 2024 at 5:00 AM Mehul Batra 
> wrote:
>
>> +1 (Non-binding)
>>
>>- Validated signatures/checksums/license
>>- Ran tests (make test & make test-s3) in Python3.11.5
>>
>> Thanks, everyone for testing and voting.
>>
>> Warm regards,
>> Mehul Batra
>>
>> On Tue, Jul 30, 2024 at 1:47 PM Honah J.  wrote:
>>
>>> +1 (binding)
>>>
>>>- Validated signatures/checksums/license
>>>- Ran tests (make test-coverage) in Python3.11
>>>- Ran Glue integration tests
>>>
>>> Thank you Sung for running the release and thanks everyone for testing
>>> and voting.
>>>
>>> Best regards,
>>> Honah
>>>
>>> On Mon, Jul 29, 2024 at 5:36 PM André Luis Anastácio
>>>  wrote:
>>>
 +1 (non-binding)


- Validated signatures / checksums
- Checked license


- Ran some code examples in Python 3.12


 André Anastácio

 On Monday, July 29th, 2024 at 2:42 PM, Kevin Liu <
 kevin.jq@gmail.com> wrote:

 +1 (non-binding)
 Verified signatures/checksums/license. Ran unit and integration tests.
 Logs are attached to this email.

 Side note: the PyIceberg website docs have not been updated, so I
 followed the GitHub docs instead.

 On Mon, Jul 29, 2024 at 8:19 AM Chinmay Bhat 
 wrote:

> Tested 0.7.0rc2.
>
> +1 (non-binding)
> - validated signatures & checksums
> - checked license - RAT checks passed
> - ran tests and test-coverage with Python 3.9
>
> Thank you everyone for the hard work!
>
> Best,
> Chinmay
>
> On Sat, Jul 27, 2024 at 3:39 PM Sung Yun  wrote:
>
>> Thank you Fokko for your help in setting the next steps for the
>> course of resolution.
>>
>> To clarify, as a follow-up to Fokko's suggestion: the PyPi release
>> under test for 0.7.0rc2 can now be found here:
>> https://pypi.org/project/pyiceberg/0.7.0rc2/
>>
>> We will leave this VOTE thread open for votes to decide on the next
>> steps for this release.
>>
>> Thank you very much, and sorry for the inconvenience caused due to
>> this issue!
>> Sung
>>
>> On Sat, Jul 27, 2024 at 5:00 AM Fokko Driesprong 
>> wrote:
>>
>>> Hey everyone,
>>>
>>> I just yanked the release from PyPi. I still encourage everyone to
>>> test out PyIceberg 0.7.0rc1 to check if everything works on their end
>>> and give all the awesome new features a go.
>>>
>>> Since the release has been yanked, and releases are immutable in
>>> PyPi, there are two ways forward:
>>>
>>>1. If the vote passes for this RC, we can unyank the release
>>>2. If there are things found that need fixing, we can bump the
>>>version to 0.7.1
>>>
>>> Wish you all a great weekend,
>>>
>>> Kind regards,
>>> Fokko
>>>
>>> On Sat, Jul 27, 2024 at 03:45 Sung Yun wrote:
>>>
 Hi ndrluis,

 Thank you VERY much for flagging this. I really appreciate you
 bringing this to our attention so quickly.

 This is the first time I'm running the release end to end, and I
 missed one small detail that led to this mishap.

 I will cancel this vote, and remove the artifact from PyPi before
 starting a new vote.

 Sung

 On Fri, Jul 26, 2024 at 9:02 PM  wrote:

> Hey Sung Yun,
>
> Thank you for starting the release.
>
> I was checking PyPI, and it looks like the release candidate was
> published as version 0.7.0 (
> https://pypi.org/project/pyiceberg/0.7.0/).
> On Friday, July 26th, 2024 at 7:35 PM, Sung Yun <
> sungwy...@gmail.com> wrote:
>
> Hi Everyone,
>
>
> I propose that we release the following RC as the official
> PyIceberg 0.7.0 release.
>
>
> This is a large release featuring many amazing contributions from
> the community, and here’s a summary of the features introduced

Re: [DISCUSS] Iceberg 1.6.1 release

2024-08-07 Thread Driesprong, Fokko
Hey Piotr,

The Avro release still has to be done. Avro 1.12.0 has been released, but
that version also drops Java 8 support, so we can't backport it. We still
have to run the 1.11.4 Avro release to backport the CVE fix.

Kind regards,
Fokko

On Wed, Aug 7, 2024 at 16:15 Piotr Findeisen wrote:

> Hi
>
> Thank you JB and Eduard for commenting!
>
> JB, which Avro version would we be updating to for the CVE fix?
>
> Best
> Piotr
>
>
> On Mon, 29 Jul 2024 at 13:36, Jean-Baptiste Onofré 
> wrote:
>
>> That's fair (and I agree), but as these upcoming Avro releases include
>> a CVE fix, I think it's worth considering.
>>
>> Regards
>> JB
>>
>> On Mon, Jul 29, 2024 at 9:07 AM Eduard Tudenhöfner
>>  wrote:
>> >
>> > I don't think we should be including general dependency updates in a
>> patch release unless they are critical.
>> >
>> > On Mon, Jul 29, 2024 at 8:13 AM Jean-Baptiste Onofré 
>> wrote:
>> >>
>> >> Hi,
>> >>
>> >> It would be great to include the Avro update in 1.6.1 release.
>> >>
>> >> I agree for a maintenance release on 1.6.x, but I would like to
>> >> include a couple of updates.
>> >>
>> >> Happy to drive this release :)
>> >>
>> >> Thanks !
>> >> Regards
>> >> JB
>> >>
>> >> On Fri, Jul 26, 2024 at 6:19 PM Piotr Findeisen
>> >>  wrote:
>> >> >
>> >> > Hi,
>> >> >
>> >> > ParallelIterable memory limit PR [1] is backported to 1.6.x branch
>> [2].
>> >> >
>> >> > Are there any other bug fixes that should go into 1.6.1 release?
>> >> >
>> >> > Best,
>> >> > Piotr
>> >> >
>> >> >
>> >> > [1] https://github.com/apache/iceberg/pull/10691
>> >> > [2] https://github.com/apache/iceberg/pull/10787
>> >> >
>> >> >
>>
>


Re: [ANNOUNCE] Release Apache Iceberg Rust v0.3.0

2024-08-20 Thread Driesprong, Fokko
Thanks Xuanwo! This is exciting and a big milestone, thanks everyone for
working on this!

Kind regards,
Fokko

On Wed, Aug 21, 2024 at 07:49 Renjie Liu wrote:

> Thanks Xuanwo for driving this release! Thanks to all contributors!
>
> On Wed, Aug 21, 2024 at 1:06 PM Xuanwo  wrote:
>
>> Hi all,
>>
>> The Apache Iceberg Rust community is pleased to announce
>> that Apache Iceberg Rust v0.3.0 has been released!
>>
>> Iceberg is an open table format for analytic datasets, and
>> Iceberg Rust is the native Rust implementation of Iceberg.
>>
>> The notable changes in v0.3.0 include:
>>
>> - Added Sync + Send to Catalog trait.
>> - Implemented OAuth for catalog REST client.
>> - Added parquet writer and reader capabilities with support for data
>> projection.
>> - Introduced memory catalog and memory file IO support.
>> - Initialized SQL Catalog and established subproject pyiceberg_core.
>> - Added support for GCS storage and AWS session tokens.
>> - Implemented concurrent table scans and data file fetching.
>> - Enhanced predicate builders and expression evaluators.
>> - Added support for timestamp columns in row filters.
>>
>> Please refer to the change log for the complete list of changes:
>> https://github.com/apache/iceberg-rust/releases/tag/v0.3.0
>>
>> Apache Iceberg Rust website: https://rust.iceberg.apache.org/
>>
>> Download Links: https://rust.iceberg.apache.org/download
>>
>> Iceberg Resources:
>> - Issue: https://github.com/apache/iceberg-rust/issues
>> - Mailing list: dev@iceberg.apache.org
>>
>> Thanks
>>
>> Xuanwo On behalf of Apache Iceberg Community
>>
>> https://xuanwo.io/
>>
>


Re: Welcome new committer and PPMC member Ratandeep Ratti

2020-02-17 Thread Driesprong, Fokko
Well deserved Ratandeep!

Cheers, Fokko

On Mon, Feb 17, 2020 at 20:43 Sud wrote:

> Congratulations Ratandeep!! Keep up the good work!
>
> On Mon, Feb 17, 2020 at 6:26 AM Anjali Norwood
>  wrote:
>
>> Congratulations Ratandeep!!
>>
>> regards.
>> Anjali.
>>
>> On Mon, Feb 17, 2020 at 12:19 AM Manish Malhotra <
>> manish.malhotra.w...@gmail.com> wrote:
>>
>>> Congratulations 🎉!!
>>>
>>> On Sun, Feb 16, 2020 at 8:37 PM RD  wrote:
>>>
 Thanks everyone!

 -Best,
 R.

 On Sun, Feb 16, 2020 at 7:39 PM David Christle
  wrote:

> Congrats!!!
>
>
>
> *From: *Jacques Nadeau 
> *Reply-To: *"dev@iceberg.apache.org" 
> *Date: *Sunday, February 16, 2020 at 7:20 PM
> *To: *Iceberg Dev List 
> *Subject: *Re: Welcome new committer and PPMC member Ratandeep Ratti
>
>
>
> Congrats!
>
>
>
> On Sun, Feb 16, 2020, 7:06 PM xiaokun ding 
> wrote:
>
> CONGRATULATIONS
>
>
>
> On Mon, Feb 17, 2020 at 11:05 AM 李响 wrote:
>
> CONGRATULATIONS!!!
>
>
>
> On Mon, Feb 17, 2020 at 9:50 AM Junjie Chen 
> wrote:
>
> Congratulations!
>
>
>
> On Mon, Feb 17, 2020 at 5:48 AM Ryan Blue  wrote:
>
> Hi everyone,
>
>
>
> I'd like to congratulate Ratandeep Ratti, who was just invited to join
> the Iceberg committers and PPMC!
>
>
>
> Thanks for your contributions and reviews, Ratandeep!
>
>
>
> rb
>
>
>
> --
>
> Ryan Blue
>
>
>
>
> --
>
> Best Regards
>
>
>
>
> --
>
>
>李响 Xiang Li
>
> 手机 cellphone :+86-136-8113-8972
> 邮件 e-mail  :wate...@gmail.com
>
>


Re: [DISCUSS] Graduating from the Apache Incubator

2020-05-11 Thread Driesprong, Fokko
+1

I'm excited about Iceberg as well. I see a lot of companies that benefit
from Iceberg, or I'm sure they will in the future.

One concern I have is the set of committers:

Mentors:
 * Ryan Blue, 408 commits, rdblue
 * Julien Le Dem, 1 commit, julienledem
 * Owen O'Malley, 4 commits, omalley
 * James R. Taylor, 0 commits, JamesRTaylor
 * Carl Steinbach, 0 commits, cwsteinbach

Committers/PPMC:
 * Parth Brahmbhatt, 13 commits, Parth-Brahmbhatt
 * Daniel C. Weeks, 7 commits, danielcweeks
 * Anton Okolnychyi, 63 commits, aokolnychyi
 * Ratandeep Ratti, 26 commits, rdsr (missing on
https://incubator.apache.org/projects/iceberg.html)

Please correct me if I'm wrong here, but a more diverse set of people would
be desirable. However, for now, I would not consider this as a blocker.

Cheers, Fokko





On Mon, May 11, 2020 at 21:16 Owen O'Malley wrote:

> +1 to graduation. It is exciting watching the project and its community
> grow.
>
> .. Owen
>
> On Mon, May 11, 2020 at 11:26 AM Ryan Blue  wrote:
>
>> Hi everyone,
>>
>> I think that Iceberg is about ready to graduate from the Apache
>> Incubator. We now have 2 releases — that include convenience binaries — and
>> have added 2 committers/PPMC members and 2 PPMC members from the original
>> set of committers. We are seeing a steady rate of contributions from a
>> diverse group of people and companies interested in Iceberg. Thank you all
>> for your contributions and for being part of this community!
>>
>> The next step is to agree as a community that we would like to graduate.
>> If you have any concerns about graduation, please raise them.
>>
>> Below is the draft resolution for the board to create an Apache Iceberg
>> TLP. This is mostly boilerplate, but I’ve added 2 things:
>>
>>1. I’d like to volunteer to be the PMC chair of the project so I’ve
>>added myself to the draft. Others are welcome to volunteer as well and we
>>can decide as a community.
>>2. The project description I filled in is: software related to
>>    “managing huge analytic datasets using a standard at-rest table format
>>    that is designed for high performance and ease of use”.
>>
>> Establish the Apache Iceberg Project
>>
>> WHEREAS, the Board of Directors deems it to be in the best interests of
>> the Foundation and consistent with the Foundation's purpose to establish
>> a Project Management Committee charged with the creation and maintenance
>> of open-source software, for distribution at no charge to the public,
>> related to managing huge analytic datasets using a standard at-rest
>> table format that is designed for high performance and ease of use.
>>
>> NOW, THEREFORE, BE IT RESOLVED, that a Project Management Committee
>> (PMC), to be known as the "Apache Iceberg Project", be and hereby is
>> established pursuant to Bylaws of the Foundation; and be it further
>>
>> RESOLVED, that the Apache Iceberg Project be and hereby is responsible
>> for the creation and maintenance of software related to managing huge
>> analytic datasets using a standard at-rest table format that is designed
>> for high performance and ease of use; and be it further
>>
>> RESOLVED, that the office of "Vice President, Apache Iceberg" be and
>> hereby is created, the person holding such office to serve at the
>> direction of the Board of Directors as the chair of the Apache Iceberg
>> Project, and to have primary responsibility for management of the
>> projects within the scope of responsibility of the Apache Iceberg
>> Project; and be it further
>>
>> RESOLVED, that the persons listed immediately below be and hereby are
>> appointed to serve as the initial members of the Apache Iceberg Project:
>>
>>  * Anton Okolnychyi 
>>  * Carl Steinbach   
>>  * Daniel C. Weeks  
>>  * James R. Taylor  
>>  * Julien Le Dem
>>  * Owen O'Malley
>>  * Parth Brahmbhatt 
>>  * Ratandeep Ratti  
>>  * Ryan Blue
>>
>> NOW, THEREFORE, BE IT FURTHER RESOLVED, that Ryan Blue be appointed to
>> the office of Vice President, Apache Iceberg, to serve in accordance
>> with and subject to the direction of the Board of Directors and the
>> Bylaws of the Foundation until death, resignation, retirement, removal
>> or disqualification, or until a successor is appointed; and be it
>> further
>>
>> RESOLVED, that the Apache Iceberg Project be and hereby is tasked with
>> the migration and rationalization of the Apache Incubator Iceberg
>> podling; and be it further
>>
>> RESOLVED, that all responsibilities pertaining to the Apache Incubator
>> Iceberg podling encumbered upon the Apache Incubator PMC are hereafter
>> discharged.
>>
>> --
>> Ryan Blue
>>
>


Re: [DISCUSS] Graduating from the Apache Incubator

2020-05-12 Thread Driesprong, Fokko
Thanks, and I can confirm that the responses on PR's or mailing list are
swift.

Cheers, Fokko

On Mon, May 11, 2020 at 23:50 Ryan Blue wrote:

> I've updated the Incubator XML file to include Ratandeep. Sorry about
> that, I forgot this page doesn't show changes from Whimsy.
>
> I agree that it would be great to have more committers, but I don't think
> that it should block graduation. The aspects that need to be satisfied for
> graduation are diversity of the committers and ability to develop
> contributors to committers. We do well on both.
>
> For ability to develop contributors, I think we do an overall good job of
> making sure contributions get attention from committers and that we have
> quality reviews that provide context to help contributors gain
> understanding. And we've demonstrated the ability to recognize when a
> contributor has that context and grant the trust that goes along with it.
>
> For diversity, no company has more than 3 PMC members and two companies
> have that many. We also have representation from others, so that no one
> organization has a majority in the PMC. The community also includes not
> just our (active!) mentors, but also several other ASF members. And our
> contributor base is even more diverse because of the number of people and
> companies interested in the project.
>
> So while I agree that we would ideally have more committers, I think it is
> more a matter of time.
>
> On Mon, May 11, 2020 at 1:17 PM Driesprong, Fokko 
> wrote:
>
>> +1
>>
>> I'm excited about Iceberg as well. I see a lot of companies that benefit
>> from Iceberg, or I'm sure they will in the future.
>>
>> One concern I have is the set of committers:
>>
>> Mentors:
>>  * Ryan Blue, 408 commits, rdblue
>>  * Julien Le Dem, 1 commit, julienledem
>>  * Owen O'Malley, 4 commits, omalley
>>  * James R. Taylor, 0 commits, JamesRTaylor
>>  * Carl Steinbach, 0 commits, cwsteinbach
>>
>> Committers/PPMC:
>>  * Parth Brahmbhatt, 13 commits, Parth-Brahmbhatt
>>  * Daniel C. Weeks, 7 commits, danielcweeks
>>  * Anton Okolnychyi, 63 commits, aokolnychyi
>>  * Ratandeep Ratti, 26 commits, rdsr (missing on
>> https://incubator.apache.org/projects/iceberg.html)
>>
>> Please correct me if I'm wrong here, but a more diverse set of people
>> would be desirable. However, for now, I would not consider this as a
>> blocker.
>>
>> Cheers, Fokko
>>
>>
>>
>>
>>
>> On Mon, May 11, 2020 at 21:16 Owen O'Malley wrote:
>>
>>> +1 to graduation. It is exciting watching the project and its community
>>> grow.
>>>
>>> .. Owen
>>>
>>> On Mon, May 11, 2020 at 11:26 AM Ryan Blue  wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I think that Iceberg is about ready to graduate from the Apache
>>>> Incubator. We now have 2 releases — that include convenience binaries — and
>>>> have added 2 committers/PPMC members and 2 PPMC members from the original
>>>> set of committers. We are seeing a steady rate of contributions from a
>>>> diverse group of people and companies interested in Iceberg. Thank you all
>>>> for your contributions and for being part of this community!
>>>>
>>>> The next step is to agree as a community that we would like to
>>>> graduate. If you have any concerns about graduation, please raise them.
>>>>
>>>> Below is the draft resolution for the board to create an Apache Iceberg
>>>> TLP. This is mostly boilerplate, but I’ve added 2 things:
>>>>
>>>>1. I’d like to volunteer to be the PMC chair of the project so I’ve
>>>>    added myself to the draft. Others are welcome to volunteer as well and
>>>>    we can decide as a community.
>>>>    2. The project description I filled in is: software related to
>>>>    “managing huge analytic datasets using a standard at-rest table format
>>>>    that is designed for high performance and ease of use”.
>>>>
>>>> Establish the Apache Iceberg Project
>>>>
>>>> WHEREAS, the Board of Directors deems it to be in the best interests of
>>>> the Foundation and consistent with the Foundation's purpose to establish
>>>> a Project Management Committee charged with the creation and maintenance
>>>> of open-source software, for distribution at no charge to the public,
>>>> related to ma

Re: [VOTE] Graduate to a top-level project

2020-05-12 Thread Driesprong, Fokko
+1

On Wed, May 13, 2020 at 08:58 jiantao yu wrote:

> +1 for graduation.
>
>
> On May 13, 2020, at 12:50 PM, Jun H. wrote:
>
> +1 for graduation.
>
>
> On Tue, May 12, 2020 at 9:41 PM 李响  wrote:
>
>
> +1 non-binding. My honor to be a part of this.
>
> On Wed, May 13, 2020 at 10:16 AM OpenInx  wrote:
>
>
> +1 for graduation. It's great news that we are ready to graduate.
>
> (non-binding).
>
> On Wed, May 13, 2020 at 9:50 AM Saisai Shao 
> wrote:
>
>
> +1 for graduation.
>
> On Wed, May 13, 2020 at 9:33 AM Junjie Chen wrote:
>
>
> +1
>
> On Wed, May 13, 2020 at 8:07 AM RD  wrote:
>
>
> +1 for graduation!
>
> On Tue, May 12, 2020 at 3:50 PM John Zhuge  wrote:
>
>
> +1
>
> On Tue, May 12, 2020 at 3:33 PM parth brahmbhatt <
> brahmbhatt.pa...@gmail.com> wrote:
>
>
> +1
>
> On Tue, May 12, 2020 at 3:31 PM Anton Okolnychyi
>  wrote:
>
>
> +1 for graduation
>
> On 12 May 2020, at 15:30, Ryan Blue  wrote:
>
> +1
>
> Jacques, I agree. I'll make sure to let you know about the IPMC vote
> because we'd love to have your support there as well.
>
> On Tue, May 12, 2020 at 3:02 PM Jacques Nadeau  wrote:
>
>
> I'm +1.
>
> (I think that is non-binding here but binding at the incubator level)
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
>
> On Tue, May 12, 2020 at 2:35 PM Romin Parekh 
> wrote:
>
>
> +1
>
> On Tue, May 12, 2020 at 2:32 PM Owen O'Malley 
> wrote:
>
>
> +1
>
> On Tue, May 12, 2020 at 2:16 PM Ryan Blue  wrote:
>
>
> Hi everyone,
>
> I propose that the Iceberg community should petition to graduate from the
> Apache Incubator to a top-level project.
>
> Here is the draft board resolution:
>
> Establish the Apache Iceberg Project
>
> WHEREAS, the Board of Directors deems it to be in the best interests of
> the Foundation and consistent with the Foundation's purpose to establish
> a Project Management Committee charged with the creation and maintenance
> of open-source software, for distribution at no charge to the public,
> related to managing huge analytic datasets using a standard at-rest
> table format that is designed for high performance and ease of use.
>
> NOW, THEREFORE, BE IT RESOLVED, that a Project Management Committee
> (PMC), to be known as the "Apache Iceberg Project", be and hereby is
> established pursuant to Bylaws of the Foundation; and be it further
>
> RESOLVED, that the Apache Iceberg Project be and hereby is responsible
> for the creation and maintenance of software related to managing huge
> analytic datasets using a standard at-rest table format that is designed
> for high performance and ease of use; and be it further
>
> RESOLVED, that the office of "Vice President, Apache Iceberg" be and
> hereby is created, the person holding such office to serve at the
> direction of the Board of Directors as the chair of the Apache Iceberg
> Project, and to have primary responsibility for management of the
> projects within the scope of responsibility of the Apache Iceberg
> Project; and be it further
>
> RESOLVED, that the persons listed immediately below be and hereby are
> appointed to serve as the initial members of the Apache Iceberg Project:
>
> * Anton Okolnychyi 
> * Carl Steinbach   
> * Daniel C. Weeks  
> * James R. Taylor  
> * Julien Le Dem
> * Owen O'Malley
> * Parth Brahmbhatt 
> * Ratandeep Ratti  
> * Ryan Blue
>
> NOW, THEREFORE, BE IT FURTHER RESOLVED, that Ryan Blue be appointed to
> the office of Vice President, Apache Iceberg, to serve in accordance
> with and subject to the direction of the Board of Directors and the
> Bylaws of the Foundation until death, resignation, retirement, removal
> or disqualification, or until a successor is appointed; and be it
> further
>
> RESOLVED, that the Apache Iceberg Project be and hereby is tasked with
> the migration and rationalization of the Apache Incubator Iceberg
> podling; and be it further
>
> RESOLVED, that all responsibilities pertaining to the Apache Incubator
> Iceberg podling encumbered upon the Apache Incubator PMC are hereafter
> discharged.
>
> Please vote in the next 72 hours.
>
> [ ] +1 Petition the IPMC to graduate to top-level project
> [ ] +0
> [ ] -1 Wait to graduate because . . .
>
> --
> Ryan Blue
>
>
>
>
> --
> Thanks,
> Romin
>
>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>
>
>
>
> --
> John Zhuge
>
>
>
>
> --
> Best Regards
>
>
>
>
> --
>
>  李响 Xiang Li
>
> 手机 cellphone :+86-136-8113-8972
> 邮件 e-mail  :wate...@gmail.com
>
>
>


Re: New committer: Shardul Mahadik

2020-07-22 Thread Driesprong, Fokko
Congrats Shardul! Great work!

Cheers, Fokko

On Thu, Jul 23, 2020 at 07:46 Miao Wang wrote:

> Congratulations!
>
> Miao
>
> Sent from my iPhone
>
> > On Jul 22, 2020, at 8:08 PM, 俊杰陈  wrote:
> >
> > Congrats! Good job!
> >
> >> On Thu, Jul 23, 2020 at 11:01 AM Saisai Shao 
> wrote:
> >>
> >> Congrats!
> >>
> >> Thanks
> >> Saisai
> >>
> >> On Thu, Jul 23, 2020 at 10:06 AM OpenInx wrote:
> >>>
> >>> Congratulations !
> >>>
> >>> On Thu, Jul 23, 2020 at 9:31 AM Jingsong Li 
> wrote:
> 
>  Congratulations Shardul! Well deserved!
> 
>  Best,
>  Jingsong
> 
>  On Thu, Jul 23, 2020 at 7:27 AM Anton Okolnychyi
>  wrote:
> >
> > Congrats and welcome! Keep up the good work!
> >
> > - Anton
> >
> > On 22 Jul 2020, at 16:02, RD  wrote:
> >
> > Congratulations Shardul! Well deserved!
> >
> > -Best,
> > R.
> >
> > On Wed, Jul 22, 2020 at 2:24 PM Ryan Blue  wrote:
> >>
> >> Hi everyone,
> >>
> >> I'd like to congratulate Shardul Mahadik, who was just invited to
> join the Iceberg committers!
> >>
> >> Thanks for all your contributions, Shardul!
> >>
> >> rb
> >>
> >>
> >> --
> >> Ryan Blue
> >
> >
> 
> 
>  --
>  Best, Jingsong Lee
> >
> >
> >
> > --
> > Thanks & Best Regards
>


Re: implementation in other languages

2021-01-03 Thread Driesprong, Fokko
Hi Casey,

1. Personally, I don't know of any implementation other than the ones in
the repository itself, so JVM and Python.
2. There aren't any fundamental limitations. Most of the underlying tech
used is compatible with many languages. One thing to keep in mind is that
you need to be able to read Avro files; Avro libraries are available for
many languages. Apart from that, the data itself is written in
Avro/Parquet/ORC, which are also widely used and supported.

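To make point 2 a bit more concrete: the table metadata file is plain
JSON, so any language with a JSON parser can take the first step of a
read. Below is a minimal Python sketch that finds the manifest list of
the current snapshot. The field names follow the Iceberg table spec as I
understand it; the function name is hypothetical, and decoding the
manifest list itself (an Avro file) is left out.

```python
import json
from typing import Optional


def current_manifest_list(metadata_json: str) -> Optional[str]:
    """Return the manifest-list location of the current snapshot, if any.

    Assumes the metadata layout with "current-snapshot-id" and a
    "snapshots" array whose entries carry "snapshot-id" and
    "manifest-list". Returns None when the table has no current snapshot.
    """
    metadata = json.loads(metadata_json)
    current_id = metadata.get("current-snapshot-id")
    for snapshot in metadata.get("snapshots", []):
        if snapshot.get("snapshot-id") == current_id:
            return snapshot.get("manifest-list")
    return None
```

From there, a reader in any language would open the returned Avro file to
list the manifests, and then the manifests to list the data files.
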
I've been working on an Apache Beam implementation (very early stage),
which also has a Go API. But that will take some time to get it to a
certain maturity level. Hope this helps!

Cheers, Fokko

On Sun, Jan 3, 2021 at 19:39 Casey Lucas wrote:

> I'm just learning about Iceberg. Thanks for the work that has gone into it
> thus far. It seems like a promising technology.
>
> I understand why the core implementation is Java based and saw that there
> are some aspects of the spec implemented in Python.
>
> 1. Is anyone aware of implementations in other languages such as go?
> 2. Is there anything that would fundamentally limit the use of other
> implementations being able to read and write Iceberg data that is also
> being manipulated by the JVM implementation? I assume no based on reading
> the spec, but any insight is appreciated.
>
> I'm curious if reading and writing Iceberg data from a spec-compliant
> implementation without necessarily involving Presto or Spark is an expected
> use-case.
>
> Thanks,
>
> Casey Lucas
> Director, Engineering
>
> dynata.com
>
> The information contained in this e-mail message is intended
> for the use of the recipient(s) named above and is privileged and
> confidential. If you are not the intended recipient, you are formally
> notified that you have received this message in error and that any review,
> dissemination, distribution, or copying of the message is strictly
> prohibited. If you have received this communication in error, please
> notify us immediately by e-mail and delete the original message.
>


Re: Welcoming Peter Vary as a new committer!

2021-01-25 Thread Driesprong, Fokko
Congratulations!

On Mon, Jan 25, 2021 at 21:16 Mass Dosage wrote:

> Nice one, well done Peter!
>
> On Mon, 25 Jan 2021 at 19:46, Daniel Weeks  wrote:
>
>> Congratulations, Peter!
>>
>> On Mon, Jan 25, 2021, 11:27 AM Jungtaek Lim 
>> wrote:
>>
>>> Congratulations Peter! Well deserved!
>>>
>>> On Tue, Jan 26, 2021 at 3:40 AM Wing Yew Poon
>>>  wrote:
>>>
 Congratulations Peter!


 On Mon, Jan 25, 2021 at 10:35 AM Russell Spitzer <
 russell.spit...@gmail.com> wrote:

> Congratulations!
>
> On Jan 25, 2021, at 12:34 PM, Jacques Nadeau 
> wrote:
>
> Congrats Peter! Thanks for all your great work
>
> On Mon, Jan 25, 2021 at 10:24 AM Ryan Blue  wrote:
>
>> Hi everyone,
>>
>> I'd like to welcome Peter Vary as a new Iceberg committer.
>>
>> Thanks for all your contributions, Peter!
>>
>> rb
>>
>> --
>> Ryan Blue
>>
>
>


Re: Welcome Fokko Driesprong as a committer!

2022-08-23 Thread Driesprong, Fokko
Thanks, everyone, for the kind words. Honored to be a committer on the
Iceberg project and be part of this great community. Looking forward to
contributing more in the future!

Kind regards,
Fokko

On Tue, Aug 23, 2022 at 09:30 Omar Al-Safi wrote:

> Congrats!
>
> On Mon, Aug 22, 2022 at 11:13 PM Steve Zhang
>  wrote:
>
>> Congrats Fokko!
>>
>> Thanks,
>> Steve Zhang
>>
>>
>>
>> On Aug 22, 2022, at 2:05 PM, Szehon Ho  wrote:
>>
>> Congratulations!
>> Szehon
>>
>> On Mon, Aug 22, 2022 at 12:25 PM Péter Váry 
>> wrote:
>>
>>> Congratulations Fokko!
>>>
>>> On Mon, Aug 22, 2022, 16:37 Jahagirdar, Amogh <
>>> jaham...@amazon.com.invalid> wrote:
>>>
 Congratulations Fokko!



 *From: *Gabor Kaszab 
 *Reply-To: *"dev@iceberg.apache.org" 
 *Date: *Monday, August 22, 2022 at 4:10 AM
 *To: *"dev@iceberg.apache.org" 
 *Subject: *RE: [EXTERNAL]Welcome Fokko Driesprong as a committer!






 Congratulations!



 On Mon, Aug 22, 2022 at 7:21 AM Jun H.  wrote:

 Congratulations, Fokko!







 On Sun, Aug 21, 2022 at 10:00 PM Rajarshi Sarkar 
 wrote:

 Congratulations, Fokko!

 Regards,
 Rajarshi Sarkar





 On Mon, Aug 22, 2022 at 3:47 AM Steven Wu  wrote:

 Congratulations and thanks for pushing the Python SDK!



 On Sun, Aug 21, 2022 at 2:18 PM Austin Bennett <
 whatwouldausti...@gmail.com> wrote:

 Thanks, for all you've done and will continue to do, Fokko!



 On Sun, Aug 21, 2022 at 2:10 PM Holden Karau 
 wrote:



 Congratulations 🍾



 On Sun, Aug 21, 2022 at 1:57 PM  wrote:

 Congratulations!

 Sent from my iPhone



 On Aug 21, 2022, at 3:00 PM, Kyle Bendickson  wrote:

 Congratulations Fokko! This is indeed very well deserved! 🥳



 It’s a pleasure to work with you!



 On Sun, Aug 21, 2022 at 12:57 PM Sam Redai  wrote:

 Huge congrats Fokko, well deserved! 🎉



 On Sun, Aug 21, 2022 at 3:55 PM Ryan Blue  wrote:

 Hi everyone,



 I would like to welcome Fokko Driesprong as a new committer to the
 project!



 Thanks for all your contributions, Fokko!





 Ryan



 --

 Ryan Blue

 Tabular

 --
 Sam Redai 

 Developer Advocate  |  Tabular 

 --
 Kyle Bendickson

 OSS Developer  |  Tabular 



 k...@tabular.io

 --

 Twitter: https://twitter.com/holdenkarau

 Books (Learning Spark, High Performance Spark, etc.):
 https://amzn.to/2MaRAG9  

 YouTube Live Streams: https://www.youtube.com/user/holdenkarau


>>


Re: Welcome Yufei Gu as a committer

2022-08-26 Thread Driesprong, Fokko
Awesome! Congrats Yufei!

Cheers, Fokko

On Fri, 26 Aug 2022 at 09:27, Gidon Gershinsky wrote:

> Congratulations, Yufei!
>
> Cheers, Gidon
>
>
> On Fri, Aug 26, 2022 at 7:30 AM Christiano Anderson 
> wrote:
>
>> Congrats, Yufei!
>>
>>
>>
>>
>> --- Original Message ---
>> On Friday, August 26th, 2022 at 01:20, Anton Okolnychyi
>>  wrote:
>>
>>
>> >
>> >
>> > I’d like to welcome Yufei Gu as a committer to the project.
>> >
>> > Thanks for all your hard work, Yufei!
>> >
>> > - Anton
>>
>


[VOTE] Release Apache PyIceberg 0.1.0rc1

2022-09-22 Thread Driesprong, Fokko
Hi Everyone,

I'm thrilled to propose that we release the following RC as the official
Apache PyIceberg 0.1.0 release.

The commit ID is 804c93cf1bf5a516556cad4eadbdf31d878069ce

This corresponds to the tag: pyiceberg-0.1.0rc1
(3e314eb0a0c6ec7c45490afd1a88c7a2cc985f69)

   - https://github.com/apache/iceberg/releases/tag/pyiceberg-0.1.0rc1
   - https://github.com/apache/iceberg/tree/804c93cf1bf5a516556cad4eadbdf31d878069ce

The release tarball, signature, and checksums are here:

   - https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.1.0rc1/

You can find the KEYS file here:

   - https://dist.apache.org/repos/dist/dev/iceberg/KEYS

You can run the following to check the signature:
> wget https://dist.apache.org/repos/dist/dev/iceberg/KEYS
> gpg --import KEYS
> gpg --verify pyiceberg-0.1.0rc1-py3-none-any.whl.asc pyiceberg-0.1.0rc1-py3-none-any.whl
gpg: Signature made do 22 sep 11:08:59 2022 CEST
gpg:                using RSA key FCD3779E399C53D995FC82A35171BA3E54493550
gpg:                issuer "fo...@apache.org"
gpg: Good signature from "Fokko Driesprong " [ultimate]

And check the checksums:
> shasum -a 512 pyiceberg-0.1.0rc1.tar.gz
bd59d0c9184d222fc5a775fe7d2c99ffa2544bc4caf223a122e5835bbe34ad9cd86e25b2197c55285ead6642dd2a289a16f7aca56be07edef13ee07805d88f18  pyiceberg-0.1.0rc1.tar.gz
> cat pyiceberg-0.1.0rc1.tar.gz.sha512
bd59d0c9184d222fc5a775fe7d2c99ffa2544bc4caf223a122e5835bbe34ad9cd86e25b2197c55285ead6642dd2a289a16f7aca56be07edef13ee07805d88f18  dist/pyiceberg-0.1.0rc1.tar.gz

Convenience binary artifacts are staged on PyPI:
https://pypi.org/project/pyiceberg/0.1.0rc1/

And can be installed by providing the RC version explicitly: pip install
pyiceberg==0.1.0rc1

The easiest way to get familiar is by using the CLI to browse through your
catalog. Docs are available on https://py.iceberg.apache.org/

Testing the code can be done using Poetry:

> docker run -t -i python:3.9 bash
> wget https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.1.0rc1/pyiceberg-0.1.0rc1.tar.gz
> tar -xf pyiceberg-0.1.0rc1.tar.gz
> cd pyiceberg-0.1.0rc1
> pip3 install poetry
> poetry install --all-extras
> poetry run pytest tests/ -m "not s3"

Please download, verify, and test.

Please vote in the next 72 hours.
[ ] +1 Release this as Apache PyIceberg 0.1.0.rc1
[ ] +0
[ ] -1 Do not release this because...

Please let me know if there are any questions, and looking forward to what
y'all think!

Kind regards,
Fokko


[VOTE] Release Apache PyIceberg 0.1.0 RC2

2022-09-24 Thread Driesprong, Fokko
Hi Everyone,

Thanks everyone for giving it a try and for the feedback. Much appreciated! I'm
canceling RC1 because the version of the package itself was tagged with
RC1. This doesn't allow us to release the code as is since we would have to
remove the RC postfix.

Other changes to make the release smoother:

   - Include the Makefile in the source distribution to make reviewing
   easier (see new commands below).
   - Include NOTICE in the source distribution.
   - Include a license checker in the source distribution to easily check
   the licenses.
   - Fix the path in the checksum file, so we can use shasum -c (see below).

I propose that we release the following RC as the official PyIceberg 0.1.0
release.

The commit ID is 83e3ab0b9fb57890d63130499e84c55b91fc0c17

   - This corresponds to the tag: pyiceberg-0.1.0rc2
   (289b4737d772260d7967c028bbb3f9a07e295ea8)
   - https://github.com/apache/iceberg/releases/tag/pyiceberg-0.1.0rc2
   - https://github.com/apache/iceberg/tree/83e3ab0b9fb57890d63130499e84c55b91fc0c17
   - Difference between RC1 and RC2:
   https://github.com/apache/iceberg/compare/pyiceberg-0.1.0rc1...pyiceberg-0.1.0rc2


The release tarball, signature, and checksums are here:
https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.1.0rc2/

You can find the KEYS file here:
https://dist.apache.org/repos/dist/dev/iceberg/KEYS

You can run the following to check the signature:
> wget https://dist.apache.org/repos/dist/dev/iceberg/KEYS
> gpg --import KEYS
> gpg --verify pyiceberg-0.1.0.tar.gz.asc pyiceberg-0.1.0.tar.gz
gpg: Signature made za 24 sep 21:07:12 2022 CEST
gpg:                using RSA key FCD3779E399C53D995FC82A35171BA3E54493550
gpg: Good signature from "Fokko Driesprong " [ultimate]


And check the checksums:
> shasum -c pyiceberg-0.1.0.tar.gz.sha512
pyiceberg-0.1.0.tar.gz: OK
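
For reviewers without shasum at hand, the same check can be sketched in Python. This is a minimal sketch, assuming the .sha512 file uses the usual "<hex digest>  <path>" layout; verify_sha512_file is a hypothetical helper, not part of PyIceberg or the release tooling. It also shows why the path recorded in the RC1 checksum file mattered: a "dist/" prefix makes the lookup miss when the tarball sits next to the checksum file.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha512_file(checksum_file: Path) -> dict:
    """Check every '<hex digest>  <path>' line, similar to `shasum -c`.

    Assumption of this sketch: paths are resolved relative to the checksum
    file's directory (shasum itself uses the current working directory).
    Returns {path: "OK" | "FAILED" | "MISSING"}.
    """
    results = {}
    for line in checksum_file.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*").strip()  # '*' marks binary mode in shasum output
        target = checksum_file.parent / name
        if not target.is_file():
            results[name] = "MISSING"
        elif sha512_of(target) == expected.lower():
            results[name] = "OK"
        else:
            results[name] = "FAILED"
    return results
```

Calling verify_sha512_file(Path("pyiceberg-0.1.0.tar.gz.sha512")) in the download directory would report one status per listed file.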

Convenience binary artifacts are staged on pypi:
https://pypi.org/project/pyiceberg/0.1.0rc2/

And can be installed using: pip3 install pyiceberg==0.1.0rc2

Testing can be done using:

> wget https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.1.0rc2/pyiceberg-0.1.0.tar.gz
> tar -xf pyiceberg-0.1.0.tar.gz
> cd pyiceberg-0.1.0
> make check-license
> make install && make test

Please download, verify, and test.

Please vote in the next 96 hours (extended due to the weekend).
[ ] +1 Release this as PyIceberg 0.1.0
[ ] +0
[ ] -1 Do not release this because...

Please don't hesitate to reach out if there are any questions.

Kind regards,
Fokko


Re: [VOTE] Release Apache PyIceberg 0.1.0 RC2

2022-09-26 Thread Driesprong, Fokko
Thanks everyone for giving it a try.

I should have explained the version on PyPI. We need to add the RC postfix
to the version when we upload it to PyPI for testing. PyPI will extract the
version from the setup.py, and omitting the RC would mean an actual
release. The tarball will just contain the version without the RC.
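
The relationship between the two version strings can be sketched in Python. This is only an illustration, assuming PEP 440-style versions such as 0.1.0rc2; split_rc and is_release_candidate_of are hypothetical helpers and not how PyIceberg or PyPI determine the version.

```python
import re

# Matches versions like "0.1.0" or "0.1.0rc2" (a small subset of PEP 440).
_VERSION = re.compile(r"^(?P<release>\d+(?:\.\d+)*)(?:rc(?P<rc>\d+))?$")


def split_rc(version: str):
    """Split a version into (final release, RC number or None)."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    rc = match.group("rc")
    return (match.group("release"), int(rc) if rc is not None else None)


def is_release_candidate_of(candidate: str, final: str) -> bool:
    """True when `candidate` is an RC build of the `final` version."""
    release, rc = split_rc(candidate)
    return rc is not None and release == final
```

With these helpers, the version uploaded to PyPI for testing ("0.1.0rc2") and the version inside the tarball ("0.1.0") map to the same final release, which is the distinction the vote stumbled over.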

To avoid confusion in the future, I would suggest hardcoding the version
instead of looking it up from the currently installed version
(https://github.com/apache/iceberg/pull/5854). This makes it clearer; the
only downside is that we need to bump two versions after a release. This is
very little effort and will reduce the ambiguity around the version. For
clarity, I also created a PR with the updated release instructions:
https://github.com/apache/iceberg/pull/5856

I would suggest a new RC after we decide on #5854.

Thanks,
Fokko




On Mon, 26 Sep 2022 at 07:30, Steve Zhang wrote:

> +0 (non-binding and it’s just version needs to be fixed )
>
> Passing:
> Verified LICENSE in the tarball
> Checked sha512 sums and signatures
> Installed the CLI and ran basic commands with a hive metastore and AWS S3
> Ran tests (on Docker python 3.9 image some pyarrow tests failed w/
> permission issue but in local they are fine)
>
> Issues:
> - same version issue as Ryan pointed out
>
> Thanks,
> Steve Zhang
>
>
>
> On Sep 25, 2022, at 10:37 AM, Ryan Blue  wrote:
>
> +0
>
> Looks great, except that the version isn’t correct: pyiceberg.__version__
> returns 0.1.0rc2
>
> Passing:
>
>    - Verified LICENSE and NOTICE content in the tarball and whl (nit:
>    NOTICE and LICENSE are in different directories)
>    - Checked sha512 sums and signatures
>    - Ran RAT checks (nit: the poetry.lock file is not excluded if you
>    create it)
>    - Ran tests
>    - Installed the CLI and ran basic commands with a REST metastore
>
> Issues:
>
>    - pyiceberg.__version__ returns 0.1.0rc2 instead of 0.1.0
>
>
> On Sat, Sep 24, 2022 at 12:51 PM Driesprong, Fokko 
> wrote:
>
>> Hi Everyone,
>>
>> Thanks everyone for giving it a try and for the feedback. Much
>> appreciated! I'm canceling RC1 because the version of the package itself
>> was tagged with RC1. This doesn't allow us to release the code as is since
>> we would have to remove the RC postfix.
>>
>> Other things to make the release smoother:
>>
>>    - Include the Makefile to the source distribution to make the
>>    reviewing easier (see new commands below).
>>    - Include NOTICE to the source distribution.
>>    - Include a license checker in the source distribution to easily
>>    check the licenses.
>>    - Fixed the path in the checksum, so we can use shasum -c (see below).
>>
>> I propose that we release the following RC as the official PyIceberg
>> 0.1.0 release.
>>
>> The commit ID is 83e3ab0b9fb57890d63130499e84c55b91fc0c17
>>
>>    - This corresponds to the tag: pyiceberg-0.1.0rc2
>>    (289b4737d772260d7967c028bbb3f9a07e295ea8)
>>    - https://github.com/apache/iceberg/releases/tag/pyiceberg-0.1.0rc2
>>    - https://github.com/apache/iceberg/tree/83e3ab0b9fb57890d63130499e84c55b91fc0c17
>>    - Difference between RC1 and RC2:
>>    https://github.com/apache/iceberg/compare/pyiceberg-0.1.0rc1...pyiceberg-0.1.0rc2
>>
>>
>> The release tarball, signature, and checksums are here:
>> https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.1.0rc2/
>>
>> You can find the KEYS file here:
>> https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>
>> You can run the following to check the signature:
>> > wget https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>> > gpg --import KEYS
>> > gpg --verify pyiceberg-0.1.0rc2.tar.gz.asc pyiceberg-0.1.0rc2.tar.gz
>> gpg: Signature made za 24 sep 21:07:12 2022 CEST
>> gpg:                using RSA key FCD3779E399C53D995FC82A35171BA3E54493550
>> gpg: Good signature from "Fokko Driesprong " [ultimate]
>>
>>
>> And check the checksums:
>> > shasum -c pyiceberg-0.1.0.tar.gz.sha512
>> pyiceberg-0.1.0.tar.gz: OK
>>
>> Convenience binary artifacts are staged on pypi:
>> https://pypi.org/project/pyiceberg/0.1.0rc2/
>>
>> And can be installed using: pip3 install pyiceberg==0.1.0rc2
>>
>> Testing can be done using:
>>
>> > wget https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.1.0rc2/pyiceberg-0.1.0.tar.gz
>> > tar -xf pyiceberg-0.1.0.tar.gz
>> > cd pyiceberg-0.1.0
>> > make check-license
>> > make install && make test
>>
>> Please download, verify, and test.
>>
>> Please vote in the next 96 hours (extended due to the weekend).
>> [ ] +1 Release this as PyIceberg 0.1.0
>> [ ] +0
>> [ ] -1 Do not release this because...
>>
>> Please don't hesitate to reach out if there are any questions,
>>
>> Kind regards,
>> Fokko
>>
>>
>
> --
> Ryan Blue
> Tabular
>
>
>


Re: [VOTE] Release Apache PyIceberg 0.1.0 RC2

2022-09-30 Thread Driesprong, Fokko
Hey Everyone,

Thanks all for checking the release. We can conclude the vote:

Binding +1:
Ryan Blue
Daniel Weeks
Jack Ye
Anton Okolnychyi

Non-Binding +1:
Fokko Driesprong
Leilei Hu

Non-Binding +0:
Steve Zhang
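
The tally can be checked against the usual ASF release-vote rule (at least three binding +1 votes, and more binding +1 than binding -1 votes). A minimal Python sketch follows; the Vote type and the release_vote_passes helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int      # +1, 0 or -1
    binding: bool


def release_vote_passes(votes) -> bool:
    """Apply the majority-approval rule to the binding votes only."""
    binding_values = [vote.value for vote in votes if vote.binding]
    plus = sum(1 for value in binding_values if value > 0)
    minus = sum(1 for value in binding_values if value < 0)
    return plus >= 3 and plus > minus


# Votes as listed in the tally above.
votes = [
    Vote("Ryan Blue", +1, True),
    Vote("Daniel Weeks", +1, True),
    Vote("Jack Ye", +1, True),
    Vote("Anton Okolnychyi", +1, True),
    Vote("Fokko Driesprong", +1, False),
    Vote("Leilei Hu", +1, False),
    Vote("Steve Zhang", 0, False),
]
passed = release_vote_passes(votes)
```

Non-binding votes are recorded but do not affect the outcome in this sketch.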

I'll publish the artifacts right away. I also would like to thank everyone
for the feedback. A lot has been fixed already along the way, and I think
we should do a new release soon to also release these fixes to the public.

Kind regards,
Fokko Driesprong

On Fri, 30 Sep 2022 at 08:43, leilei hu wrote:

> +1
>
> Ran checksum, checked license and signature, ran unit tests.
>
>
> Minor issues (non-blockers):
> On https://pypi.org/project/pyiceberg/0.1.0rc2/, clicking the URL
> https://pyiceberg.apache.org fails with "Unable to access this site".
>
>
> In addition, the README.md
> <https://github.com/apache/iceberg/blob/master/python/README.md> is a
> little sparse. I recommend enriching it and specifying the supported
> Python versions (<4.0,>=3.8).
>
>
>
> On Sep 30, 2022, at 12:50 PM, Anton Okolnychyi wrote:
>
> +1
>
> - Anton
>
> On Sep 29, 2022, at 9:42 AM, Ye, Jack  wrote:
>
> +1
>
> Ran checksum, checked license and signature, ran unit tests.
> Ran against Hive catalog and S3 with CLI, tested create/load/drop/rename
> table and create/drop/load database.
>
> Best,
> Jack Ye
>
> *From: *Daniel Weeks 
> *Reply-To: *"dev@iceberg.apache.org" 
> *Date: *Wednesday, September 28, 2022 at 9:37 PM
> *To: *"dev@iceberg.apache.org" 
> *Subject: *RE: [EXTERNAL][VOTE] Release Apache PyIceberg 0.1.0 RC2
>
> +1
>
> I checked sigs/sums/license/tests.
> I ran through the CLI commands with REST Catalog and a few with Hive
> Metastore.
>
> Minor issues (non-blockers):
>   - Misconfigured uri / credentials often resulted in confusing
> errors (asking to set fields that were already supplied).
>   - I wasn't able to get the environment variables to work in some cases
> (possibly user error, command line arguments worked fine).
>
> A few minor notes on the verification process:
>   - some of the instructions (like gpg check) had RC reference, but that's
> not the binary being checked
>   - the license check is a little hard to know if it passed or not.  It
> would be great if it gave a pass/fail at the end
>
> On Mon, Sep 26, 2022 at 1:01 PM Ryan Blue  wrote:
>
> Thanks for the clarification, Fokko!
>
> I think it makes sense that I'd get an RC version from what was published
> as an RC on PyPI! Since we will publish a final artifact with the right
> version and none of the files in the release candidate are affected (it's
> correct in the tarball and whl files) then I'll change my vote to +1.
>
> Ryan
>
> On Mon, Sep 26, 2022 at 12:30 AM Driesprong, Fokko 
> wrote:
>
> Thanks everyone for giving it a try.
>
> I should have explained the version on PyPI. We need to add the RC postfix
> to the version when we upload it to PyPI for testing. PyPI will extract the
> version from the setup.py, and omitting the RC would mean an actual
> release. The tarball will just contain the version without the RC.
>
> To avoid confusion in the future, I would suggest to hardcode the version
> instead of looking it up from the currently installed version:
> https://github.com/apache/iceberg/pull/5854 This makes it more clear, the
> only thing is that we need to bump two versions after a release. This is
> very little effort and will reduce the ambiguity around the version. For
> clarity, I also created a PR with the updated release instructions:
> https://github.com/apache/iceberg/pull/5856
>
> I would suggest a new RC after we decide on #5854
>
> Thanks,
> Fokko
>
>
>
>
> On Mon, 26 Sep 2022 at 07:30, Steve Zhang <
> hongyue_zh...@apple.com.invalid> wrote:
>
> +0 (non-binding and it’s just version needs to be fixed )
>
> Passing:
> Verified LICENSE in the tarball
> Checked sha512 sums and signatures
> Installed the CLI and ran basic commands with a hive metastore and AWS S3
> Ran tests (on Docker python 3.9 image some pyarrow tests failed w/
> permission issue but in local they are fine)
>
> Issues:
> - same version issue as Ryan pointed out
>
> Thanks,
> Steve Zhang
>
>
>
>
> On Sep 25, 2022, at 10:37 AM, Ryan Blue  wrote:
>
>
> +0
> Looks great, except that the version isn’t correct: pyiceberg.__version__
> returns 0.1.0rc2
>
> Passing:
>
>- Verified LICENSE and NOTICE content in the tarball and whl (nit:
>NOTICE and LICENSE are in different directories)
>- Checked sha512 sums and signatures
>- R

Re: [VOTE] Release Apache Iceberg 1.0.0 RC0

2022-10-10 Thread Driesprong, Fokko
+1 (non-binding)

- Ran the license and checksum checks
- Ran test suite of Trino against 1.0.0 and didn't see any issues

Cheers, Fokko


On Mon, 10 Oct 2022 at 08:00, Ajantha Bhat wrote:

> +1 (non-binding)
>
>
>    - Verified the Spark runtime jar contents.
>    - Checked license docs, ran RAT checks.
>    - Validated checksum and signature.
>
>
> Thanks,
> Ajantha
>
> On Mon, Oct 10, 2022 at 10:45 AM Prashant Singh 
> wrote:
>
>> Hello Everyone,
>>
>> Wanted to know your thoughts on whether we should also include the
>> following bug fixes in this release as well:
>>
>> 1. MERGE INTO nullability fix, leads to query failure otherwise:
>> *Reported instances :*
>> a.
>> https://stackoverflow.com/questions/73424454/spark-iceberg-merge-into-issue-caused-by-org-apache-spark-sql-analysisexcep
>> b. https://github.com/apache/iceberg/issues/5739
>> c. https://github.com/apache/iceberg/issues/5424#issuecomment-1220688298
>>
>> *PR's (Merged):*
>> a. https://github.com/apache/iceberg/pull/5880
>> b. https://github.com/apache/iceberg/pull/5679
>>
>> 2.  QueryFailure when running RewriteManifestProcedure on Date /
>> Timestamp partitioned table when
>> `spark.sql.datetime.java8API.enabled` is true.
>> *Reported instances :*
>> a. https://github.com/apache/iceberg/issues/5104
>> b.
>> https://apache-iceberg.slack.com/archives/C025PH0G1D4/p1663982635731469
>>
>> *PR* :
>> a. https://github.com/apache/iceberg/pull/5860
>>
>> Regards,
>> Prashant Singh
>>
>> On Mon, Oct 10, 2022 at 4:15 AM Ryan Blue  wrote:
>>
>>> +1 (binding)
>>>
>>>    - Checked license docs, ran RAT checks
>>>    - Validated checksum and signature
>>>    - Built and tested with Java 11
>>>    - Built binary artifacts with Java 8
>>>
>>>
>>> On Sun, Oct 9, 2022 at 3:42 PM Ryan Blue  wrote:
>>>
>>>> Hi Everyone,
>>>>
>>>> I propose that we release the following RC as the official Apache
>>>> Iceberg 1.0.0 release.
>>>>
>>>> The commit ID is e2bb9ad7e792efca419fa7c4a1afde7c4c44fa01
>>>> * This corresponds to the tag: apache-iceberg-1.0.0-rc0
>>>> * https://github.com/apache/iceberg/commits/apache-iceberg-1.0.0-rc0
>>>> * https://github.com/apache/iceberg/tree/e2bb9ad7e792efca419fa7c4a1afde7c4c44fa01
>>>>
>>>> The release tarball, signature, and checksums are here:
>>>> * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.0.0-rc0
>>>>
>>>> You can find the KEYS file here:
>>>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>>>
>>>> Convenience binary artifacts are staged on Nexus. The Maven repository
>>>> URL is:
>>>> * https://repository.apache.org/content/repositories/orgapacheiceberg-1106/
>>>>
>>>> Please download, verify, and test.
>>>>
>>>> This release is based on the latest 0.14.1 release. It includes changes
>>>> to remove deprecated APIs and the following additional bug fixes:
>>>> * Increase metrics limit to 100 columns
>>>> * Bump Spark patch versions for CVE-2022-33891
>>>> * Exclude Scala from Spark runtime Jars
>>>>
>>>> Please vote in the next 72 hours.
>>>>
>>>> [ ] +1 Release this as Apache Iceberg 1.0.0
>>>> [ ] +0
>>>> [ ] -1 Do not release this because...
>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>>
>>>
>>>
>>> --
>>> Ryan Blue
>>>
>>


Re: [VOTE] Release Apache Iceberg 1.1.0 RC2

2022-11-17 Thread Driesprong, Fokko
-1 (non-binding)

To test the release, I plugged the latest RC2 into Trino and found that
we have a regression: https://github.com/trinodb/trino/pull/15079/files

It throws this exception:

java.lang.UnsupportedOperationException: hash(value) is not supported on the base Bucket class
    at org.apache.iceberg.transforms.Bucket.hash(Bucket.java:90)
    at org.apache.iceberg.transforms.Bucket.apply(Bucket.java:99)
    at org.apache.iceberg.transforms.Bucket.apply(Bucket.java:38)
    at io.trino.plugin.iceberg.IcebergPageSink.applyTransform(IcebergPageSink.java:372)
    at io.trino.plugin.iceberg.IcebergPageSink.getPartitionData(IcebergPageSink.java:364)
    at io.trino.plugin.iceberg.IcebergPageSink.getWriterIndexes(IcebergPageSink.java:288)
    at io.trino.plugin.iceberg.IcebergPageSink.writePage(IcebergPageSink.java:215)
    at io.trino.plugin.iceberg.IcebergPageSink.doAppend(IcebergPageSink.java:210)
    at io.trino.plugin.iceberg.IcebergPageSink.appendPage(IcebergPageSink.java:161)

We've removed the source type from the transform, so we can do lazy binding,
which is very nice, but this wasn't deprecated in 1.0.0.
I'm not sure if we can make this backward compatible without diving into
the details, mostly because we've also built on top of it.
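
To make the failure mode concrete, here is a rough Python sketch of a transform that is created without a source type and only becomes usable once it is bound. It illustrates the pattern only: the class and method names are hypothetical, and zlib.crc32 is a stand-in for the real hash, so the bucket values do not match Iceberg's.

```python
import zlib


class Bucket:
    """Unbound bucket transform: knows the bucket count, not the source type."""

    def __init__(self, num_buckets: int):
        self.num_buckets = num_buckets

    def hash(self, value) -> int:
        # Mirrors the exception in the trace: the base class cannot hash
        # because it does not yet know how to serialize the value.
        raise NotImplementedError(
            "hash(value) is not supported on the base Bucket class"
        )

    def apply(self, value):
        if value is None:
            return None
        return (self.hash(value) & 0x7FFFFFFF) % self.num_buckets

    def bind(self, source_type: type) -> "Bucket":
        """Lazily pick the typed implementation once the source type is known."""
        if source_type is str:
            return _StringBucket(self.num_buckets)
        if source_type is int:
            return _IntBucket(self.num_buckets)
        raise TypeError(f"cannot bucket by {source_type.__name__}")


class _StringBucket(Bucket):
    def hash(self, value: str) -> int:
        return zlib.crc32(value.encode("utf-8"))  # stand-in hash, not the real one


class _IntBucket(Bucket):
    def hash(self, value: int) -> int:
        return zlib.crc32(value.to_bytes(8, "little", signed=True))
```

A caller written against the old behaviour calls apply() directly on the unbound object and hits the exception, while a caller that first calls bind() with the source type gets a working transform. That is the compatibility gap described above.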

Best, Fokko




On Thu, 17 Nov 2022 at 18:16, Eduard Tudenhoefner wrote:

> +1 (non-binding)
>
>    - validated checksum and signature
>    - checked license docs & ran RAT checks
>    - ran build and tests with JDK11
>
>
> On Thu, Nov 17, 2022 at 11:30 AM Gabor Kaszab 
> wrote:
>
>> Hi Everyone,
>>
>> I propose that we release the following RC as the official Apache Iceberg 
>> 1.1.0 release.
>>
>> The commit ID is b3eaf0c6cb9cf6357a925c7443baadd54515a971
>> * This corresponds to the tag: apache-iceberg-1.1.0-rc2
>> * https://github.com/apache/iceberg/commits/apache-iceberg-1.1.0-rc2
>> * 
>> https://github.com/apache/iceberg/tree/b3eaf0c6cb9cf6357a925c7443baadd54515a971
>>
>> The release tarball, signature, and checksums are here:
>> * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.1.0-rc2
>>
>> You can find the KEYS file here:
>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>
>> Convenience binary artifacts are staged on Nexus. The Maven repository URL 
>> is:
>> * https://repository.apache.org/content/repositories/orgapacheiceberg-/
>>
>> Please download, verify, and test.
>>
>> Please vote in the next 72 hours.
>>
>> [ ] +1 Release this as Apache Iceberg 1.1.0
>> [ ] +0
>> [ ] -1 Do not release this because...
>>
>> Again, thanks to *Fokko* for running the RC creation steps for me!
>>
>>


Re: [VOTE] Release Apache Iceberg 1.1.0 RC2

2022-11-19 Thread Driesprong, Fokko
Hey everyone,

Wanted to let you know that I have a working fix for the breaking API change:
https://github.com/apache/iceberg/pull/6220
Let me know what you think.

Kind regards,
Fokko Driesprong

On Sat, 19 Nov 2022 at 14:20, leilei hu wrote:

> +1(non-binding)
> verified(java 8):
>
> - Create table using HiveCatalog and HadoopCatalog
> - Spark Structured Streaming with Spark 3.2.1
> - Spark query with Spark’s DataSourceV2 API
> - Ran build with JDK8
>
> On Nov 18, 2022, at 12:39 AM, Szehon Ho wrote:
>
> +1 (non-binding)
>
>
>


Re: [VOTE] Release Apache Iceberg 1.1.0 RC2

2022-11-20 Thread Driesprong, Fokko
Thanks! Would be great to get this one in before RC3 as well:
https://github.com/apache/iceberg/pull/6195

On Sun, 20 Nov 2022 at 21:03, Ryan Blue wrote:

> I just merged Fokko's fix for the breaking change, so we should be
> unblocked now. Thanks, Fokko!
>
> On Sat, Nov 19, 2022 at 3:30 PM Ryan Blue  wrote:
>
>> Thanks, Fokko! I just reviewed the PR and it's almost ready to go.
>>
>> On Sat, Nov 19, 2022 at 3:01 PM Driesprong, Fokko 
>> wrote:
>>
>>> Hey everyone,
>>>
>>> Wanted to let you know that I got a working fix for the breaking API:
>>> https://github.com/apache/iceberg/pull/6220 Let me know what you think.
>>>
>>> Kind regards,
>>> Fokko Driesprong
>>>
>>> On Sat, 19 Nov 2022 at 14:20, leilei hu wrote:
>>>
>>>> +1(non-binding)
>>>> verified(java 8):
>>>>
>>>> - Create table using HiveCatalog and HadoopCatalog
>>>> - Spark Structured Streaming with Spark 3.2.1
>>>> - Spark query with Spark’s DataSourceV2 API
>>>> - Ran build with JDK8
>>>>
>>>> On Nov 18, 2022, at 12:39 AM, Szehon Ho wrote:
>>>>
>>>> +1 (non-binding)
>>>>
>>>>
>>>>
>>
>> --
>> Ryan Blue
>> Tabular
>>
>
>
> --
> Ryan Blue
> Tabular
>


Re: [VOTE] Release Apache Iceberg 1.1.0 RC2

2022-11-21 Thread Driesprong, Fokko
Hey everyone,

Unfortunately, I caught another issue that introduces a regression:
https://github.com/apache/iceberg/pull/6242
Once that's in, we can cut another RC.

Kind regards,
Fokko Driesprong

On Mon, 21 Nov 2022 at 11:31, Gabor Kaszab wrote:

> Hi Everyone,
>
> I propose that we release the following RC as the official Apache Iceberg 
> 1.1.0 release.
>
> The commit ID is 132cb94fea66644bdefe0c3608693cf317a72f3f
> * This corresponds to the tag: apache-iceberg-1.1.0-rc3
> * https://github.com/apache/iceberg/commits/apache-iceberg-1.1.0-rc3
> * 
> https://github.com/apache/iceberg/tree/132cb94fea66644bdefe0c3608693cf317a72f3f
>
> The release tarball, signature, and checksums are here:
> * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.1.0-rc3
>
> You can find the KEYS file here:
> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>
> Convenience binary artifacts are staged on Nexus. The Maven repository URL is:
> * https://repository.apache.org/content/repositories/orgapacheiceberg-1113/
>
> Please download, verify, and test.
>
> Please vote in the next 72 hours (not counting weekends).
>
> [ ] +1 Release this as Apache Iceberg 1.1.0
> [ ] +0
> [ ] -1 Do not release this because...
>
>


Re: [VOTE] Release Apache Iceberg 1.1.0 RC4

2022-11-24 Thread Driesprong, Fokko
Hey everyone!

First of all, Happy Thanksgiving!

+1 (non-binding)

It looks good now on the Trino side. Some tests are still failing, but
that's explainable. As an example, the following test is still failing:
https://github.com/trinodb/trino/blob/ed2f14ce92a67fd5c951d6258a2d1e9d4540d546/plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergMetadataFileOperations.java#L229-L241

With the output:

Expecting:
  <[FileOperation{fileType=SNAPSHOT, operationType=INPUT_FILE_GET_LENGTH},
FileOperation{fileType=MANIFEST, operationType=INPUT_FILE_GET_LENGTH},
FileOperation{fileType=SNAPSHOT, operationType=INPUT_FILE_NEW_STREAM},
FileOperation{fileType=METADATA_JSON,
operationType=INPUT_FILE_NEW_STREAM},
FileOperation{fileType=MANIFEST, operationType=INPUT_FILE_NEW_STREAM}]>
to contain exactly in any order:
  <[FileOperation{fileType=MANIFEST, operationType=INPUT_FILE_GET_LENGTH},
FileOperation{fileType=MANIFEST, operationType=INPUT_FILE_GET_LENGTH},
FileOperation{fileType=MANIFEST, operationType=INPUT_FILE_NEW_STREAM},
FileOperation{fileType=MANIFEST, operationType=INPUT_FILE_NEW_STREAM},
FileOperation{fileType=METADATA_JSON,
operationType=INPUT_FILE_NEW_STREAM},
FileOperation{fileType=SNAPSHOT, operationType=INPUT_FILE_GET_LENGTH},
FileOperation{fileType=SNAPSHOT, operationType=INPUT_FILE_NEW_STREAM}]>
but could not find the following elements:
  <[FileOperation{fileType=MANIFEST, operationType=INPUT_FILE_GET_LENGTH},
FileOperation{fileType=MANIFEST, operationType=INPUT_FILE_NEW_STREAM}]>

It looks like we're reading fewer manifests. After running a git bisect, I
narrowed it down to this PR: https://github.com/apache/iceberg/pull/5632,
which is an optimization that makes perfect sense.
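
The comparison in that assertion is a multiset comparison, which can be reproduced with a short Python sketch. The tuples are transcribed from the test output above; reading the first list as what the test observed and the second as what it expected follows the AssertJ message format, and multiset_diff is a hypothetical helper name.

```python
from collections import Counter

# (fileType, operationType) pairs observed with the 1.1.0 RC.
observed = [
    ("SNAPSHOT", "INPUT_FILE_GET_LENGTH"),
    ("MANIFEST", "INPUT_FILE_GET_LENGTH"),
    ("SNAPSHOT", "INPUT_FILE_NEW_STREAM"),
    ("METADATA_JSON", "INPUT_FILE_NEW_STREAM"),
    ("MANIFEST", "INPUT_FILE_NEW_STREAM"),
]

# Pairs the Trino test expected, in any order.
expected = [
    ("MANIFEST", "INPUT_FILE_GET_LENGTH"),
    ("MANIFEST", "INPUT_FILE_GET_LENGTH"),
    ("MANIFEST", "INPUT_FILE_NEW_STREAM"),
    ("MANIFEST", "INPUT_FILE_NEW_STREAM"),
    ("METADATA_JSON", "INPUT_FILE_NEW_STREAM"),
    ("SNAPSHOT", "INPUT_FILE_GET_LENGTH"),
    ("SNAPSHOT", "INPUT_FILE_NEW_STREAM"),
]


def multiset_diff(expected_ops, observed_ops):
    """Return (missing, unexpected) operation counts, respecting duplicates."""
    missing = Counter(expected_ops) - Counter(observed_ops)
    unexpected = Counter(observed_ops) - Counter(expected_ops)
    return dict(missing), dict(unexpected)


missing, unexpected = multiset_diff(expected, observed)
```

Here missing holds one MANIFEST length lookup and one MANIFEST stream open, and unexpected is empty, which is consistent with one fewer manifest being read.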

Kind regards,
Fokko Driesprong


On Wed, 23 Nov 2022 at 17:15, Eduard Tudenhoefner wrote:

> +1 (non-binding)
>
>    - validated checksum and signature
>    - checked license docs & ran RAT checks
>    - ran build and tests with JDK11
>    - integrated 1.1.0 RC4 into Presto
>
>
>
>
> On Wed, Nov 23, 2022 at 9:14 AM Ajantha Bhat 
> wrote:
>
>> +1 (non-binding)
>>
>> - verified tests against spark-3.3 runtime jar with Nessie catalog.
>> - verified the contents of the iceberg-spark-runtime-3.3_2.12-1.1.0.jar
>> - checked for spark-3.0 removal
>> - validated checksum and signature
>> - checked license docs & ran RAT checks
>> - ran build with JDK1.8
>>
>> Thanks,
>> Ajantha
>>
>> On Tue, Nov 22, 2022 at 9:49 PM Gabor Kaszab 
>> wrote:
>>
>>> Hi Everyone,
>>>
>>> I propose that we release the following RC as the official Apache Iceberg 
>>> 1.1.0 release.
>>>
>>> The commit ID is ede085d0f7529f24acd0c81dd0a43f7bb969b763
>>> * This corresponds to the tag: apache-iceberg-1.1.0-rc4
>>> * https://github.com/apache/iceberg/commits/apache-iceberg-1.1.0-rc4
>>> * 
>>> https://github.com/apache/iceberg/tree/ede085d0f7529f24acd0c81dd0a43f7bb969b763
>>>
>>> The release tarball, signature, and checksums are here:
>>> * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.1.0-rc4
>>>
>>> You can find the KEYS file here:
>>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>>
>>> Convenience binary artifacts are staged on Nexus. The Maven repository URL 
>>> is:
>>> * https://repository.apache.org/content/repositories/orgapacheiceberg-1114/
>>>
>>> Please download, verify, and test.
>>>
>>> Please vote in the next 72 hours.
>>>
>>> [ ] +1 Release this as Apache Iceberg 1.1.0
>>> [ ] +0
>>> [ ] -1 Do not release this because...
>>>
>>>


Re: [VOTE] Release Apache Iceberg 1.1.0 RC4

2022-11-28 Thread Driesprong, Fokko
Hi Piotr,

Thanks for asking. It is indeed the creation time, as Ajantha already
indicated. I released the RC4 artifact as the final version earlier
today after Gabor concluded the vote. The tag and release are published on
GitHub, and we're working on the documentation steps.

Kind regards,
Fokko

On Mon, 28 Nov 2022 at 19:22, Ajantha Bhat wrote:

> I think it shows the jars created time. Not the published time.
> I can see the same thing for previous releases too.
> (for example, 1.0.0 rc0 voting started on oct-9 and voting passed on
> oct-14 and the jars are dated to oct-9 itself)
>
> Also, I think that jars are pushed to maven central today itself not
> before because dependabot raised iceberg bump PR today not before.
>
> Thanks,
> Ajantha
>
> On Mon, Nov 28, 2022 at 11:01 PM Ryan Blue  wrote:
>
>> Piotr, thanks for catching that. I'm not sure what happened, but the
>> Jar's git commit ID is ede085d0f7529f24acd0c81dd0a43f7bb969b763, so it
>> corresponds to RC4 that passed.
>>
>> On Mon, Nov 28, 2022 at 9:07 AM Piotr Findeisen 
>> wrote:
>>
>>> Hi,
>>>
>>>
>>> https://repo.maven.apache.org/maven2/org/apache/iceberg/iceberg-core/1.1.0/
>>> is already published (on Nov 22, so before voting was concluded)
>>> Is it "the" release, or there will be new tag pushed to maven central?
>>>
>>> best,
>>> PF
>>>
>>>
>>> On Mon, Nov 28, 2022 at 9:18 AM Gabor Kaszab 
>>> wrote:
>>>

 Hey All,

 I kept the voting open longer than the usual 72 hours due to
 Thanksgiving in the US. We now have a critical mass and the *vote
 passes* with the following results:

 +1: 13 (including 3 binding)
 without having any 0 or -1.

 Thanks to everyone who verified the release and voted!
 Also huge thanks to *Fokko* for creating all those release candidates!

 I'll proceed with the next steps right away.

 Cheers,
 Gabor

 On Mon, Nov 28, 2022 at 1:07 AM OpenInx  wrote:

> +1 (binding)
>
> 1. Download the source tarball, signature (.asc), and checksum
> (.sha512):   OK
> 2. Import gpg keys: download KEYS and run gpg --import
> /path/to/downloaded/KEYS.txt (optional if this hasn’t changed) :  OK
> 3. Verify the signature by running: gpg --verify
> apache-iceberg-1.1.0.tar.gz.asc :  OK
> 4. Verify the checksum by running: shasum -a 512
> apache-iceberg-1.1.0.tar.gz  :  OK
> 5. Untar the archive and go into the source directory: tar xzvf
> apache-iceberg-1.1.0.tar.gz && cd apache-iceberg-1.1.0:  OK
> 6. Run RAT checks to validate license headers: dev/check-license: OK
> 7. Build and test the project: ./gradlew build (use Java 8) :   OK
>
> Thanks.
>
> On Mon, Nov 28, 2022 at 3:05 AM Daniel Weeks 
> wrote:
>
>> +1 (binding)
>>
>> Verified sigs/sums/licenses/build/test
>> Built and tested with JDK 8
>>
>> -Dan
>>
>> On Fri, Nov 25, 2022 at 5:58 PM leilei hu 
>> wrote:
>>
>>> +1(non-binding)
>>> verified(java 8):
>>>
>>> - Create table using HiveCatalog and HadoopCatalog
>>> - Spark Structured Streaming with Spark 3.2.1
>>> - Spark query with Spark’s DataSourceV2 API
>>> - Ran build with JDK8
>>>
>>> On Nov 24, 2022, at 12:13 AM, Cheng Pan wrote:
>>>
>>> +1 (non-binding)
>>>
>>> Passed integration test[1] w/ Apache Kyuubi
>>>
>>> [1] https://github.com/apache/incubator-kyuubi/pull/3810
>>>
>>> Thanks,
>>> Cheng Pan
>>>
>>>
>>> On Nov 23, 2022 at 16:13:34, Ajantha Bhat 
>>> wrote:
>>>
 +1 (non-binding)

 - verified tests against spark-3.3 runtime jar with Nessie catalog.
 - verified the contents of the
 iceberg-spark-runtime-3.3_2.12-1.1.0.jar
 - checked for spark-3.0 removal
 - validated checksum and signature
 - checked license docs & ran RAT checks
 - ran build with JDK1.8

 Thanks,
 Ajantha

 On Tue, Nov 22, 2022 at 9:49 PM Gabor Kaszab <
 gaborkas...@apache.org> wrote:

> Hi Everyone,
>
> I propose that we release the following RC as the official Apache 
> Iceberg 1.1.0 release.
>
> The commit ID is ede085d0f7529f24acd0c81dd0a43f7bb969b763
> * This corresponds to the tag: apache-iceberg-1.1.0-rc4
> * https://github.com/apache/iceberg/commits/apache-iceberg-1.1.0-rc4
> * 
> https://github.com/apache/iceberg/tree/ede085d0f7529f24acd0c81dd0a43f7bb969b763
>
> The release tarball, signature, and checksums are here:
> * 
> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.1.0-rc4
>
> You can find the KEYS file here:
> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>
> Convenience binary artifacts a

Re: [VOTE] Release Apache PyIceberg 0.2.0

2022-12-02 Thread Driesprong, Fokko
Hey Jun, Ryan,

Thanks Jun for sending out the first RC, and thanks Ryan for giving it
a try right away.

> * RAT checks were difficult to read and included the downloaded RAT Jar
> file (we should use the Iceberg script)

Created a PR: https://github.com/apache/iceberg/pull/6348

> * pip install failed at first, but succeeded later. The installed version
> complained that pydantic was missing even though pip claims it is installed
> (maybe just me?)

Would you happen to have any more information? I've tried in a new docker
container against different Python versions but wasn't able to reproduce
this.

> * make lint fails because there is no .git directory

This one is more complicated than it seems at first glance. Pre-commit
relies on git to list the files to check. This way you can also lint the
files you have changed. However, just initializing an empty repository
doesn't work, since we need to have the `python/` subdirectory similar to
the iceberg repository. The easiest way to fix this would be to split out
PyIceberg into its own repository.

> ArrowNotImplementedError: Function 'greater_equal' has no kernel matching
> input types (timestamp[us, tz=+00:00], int64)
>
Great catch! I was able to reproduce this and created a fix:
https://github.com/apache/iceberg/pull/6346
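
For anyone following along, here is a minimal sketch of the type mismatch
behind this error. It assumes (a reading of the error message, not something
stated in the thread) that the filter value reached Arrow as raw microseconds
since the epoch, an int64, while the column was timestamp[us, tz=+00:00]. The
helper names are hypothetical and are not PyIceberg's API; the actual fix is in
the PR above.

```python
from datetime import datetime, timedelta, timezone

# Timestamps with a zone are counted in microseconds from this instant.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_timestamptz(micros: int) -> datetime:
    """Turn raw microseconds since the epoch into a tz-aware datetime."""
    return EPOCH + timedelta(microseconds=micros)


def timestamptz_to_micros(ts: datetime) -> int:
    """Turn a tz-aware datetime into microseconds since the epoch."""
    return (ts - EPOCH) // timedelta(microseconds=1)


# Hypothetical filter bound: 2022-12-01T00:00:00+00:00
bound = datetime(2022, 12, 1, tzinfo=timezone.utc)
bound_micros = timestamptz_to_micros(bound)
```

Bringing the literal and the column to the same type before comparing (either
both as timestamps or both as integers) avoids asking Arrow for a
timestamp-versus-int64 kernel.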

Kind regards,
Fokko Driesprong

On Fri, Dec 2, 2022 at 2:38 AM Ryan Blue wrote:

> +0
>
> Checksum, signature, tests, and RAT passed, but there were a couple of
> issues:
> * RAT checks were difficult to read and included the downloaded RAT Jar
> file (we should use the Iceberg script)
> * pip install failed at first, but succeeded later. The installed version
> complained that pydantic was missing even though pip claims it is installed
> (maybe just me?)
> * make lint fails because there is no .git directory
>
> I also converted a scan to Arrow, but got an error when trying to filter
> by a timestamp:
> ArrowNotImplementedError: Function 'greater_equal' has no kernel matching
> input types (timestamp[us, tz=+00:00], int64)
>
>
> On Thu, Dec 1, 2022 at 4:15 PM Jun H.  wrote:
>
>> Hi Everyone,
>>
>> I propose that we release the following RC as the official PyIceberg
>> 0.2.0 release.
>>
>> The commit ID is af0f35258b9caadb0bfc5311c224c2facc0f18aa
>> * This corresponds to the tag: pyiceberg-0.2.0rc0
>> (c5928c46a12be07b99a1f115373bc65e6a2d2afe)
>> * https://github.com/apache/iceberg/releases/tag/pyiceberg-0.2.0rc0
>> *
>> https://github.com/apache/iceberg/tree/af0f35258b9caadb0bfc5311c224c2facc0f18aa
>>
>> The release tarball, signature, and checksums are here:
>> * https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.2.0rc0/
>>
>> You can find the KEYS file here:
>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>
>> Convenience binary artifacts are staged on pypi:
>> https://pypi.org/project/pyiceberg/0.2.0rc0/
>>
>> And can be installed using: pip3 install pyiceberg==0.2.0rc0
>>
>> Please download, verify, and test.
>>
>> Please vote in the next 72 hours.
>>
>> [ ] +1 Release this as PyIceberg 0.2.0
>> [ ] +0
>> [ ] -1 Do not release this because...
>>
>>
>>
>
> --
> Ryan Blue
> Tabular
>


Re: In Remembrance of Kyle

2022-12-05 Thread Driesprong, Fokko
Sad to hear this awful news. I remember Kyle as someone that always took
the time to help everyone out. I've learned a lot from him, and it was a
privilege to work with him. Each time I work on the open-api spec, he puts
a smile on my face.

He will be truly missed.

Fokko

On Mon, Dec 5, 2022 at 7:51 PM huaxin gao wrote:

> I am extremely shocked and saddened to hear of Kyle's passing.
>
> When I made my very first Iceberg PR last August, Kyle reviewed it
> immediately and helped me on it, and he did so for almost all my PRs.
>
> I pulled out a couple of my old PRs just now and re-read his comments. He
> liked to put smiling faces in his comments. These little smiling faces
> always made me feel welcomed and brightened my heart.
>
> Kyle, you will be truly missed and always remembered!
>
>
>
> On Mon, Dec 5, 2022 at 9:54 AM Yufei Gu  wrote:
>
>> I am sorry to hear about Kyle's passing. He was a wonderful person and a
>> valuable member of the open-source community. It's always difficult to lose
>> someone who has made such a positive impact on our lives. I remember
>> working with him on several projects and being impressed by his technical
>> knowledge, his passion and his humbleness. He will be greatly missed by all
>> who knew him. Thank you for sharing your memories of Kyle and for offering
>> to pass them on to his family.
>>
>>
>> Best,
>>
>> Yufei
>>
>> `This is not a contribution`
>>
>>
>> On Mon, Dec 5, 2022 at 8:36 AM Steven Wu  wrote:
>>
>>> This is such tough news. I had the fortune to work with him closely in
>>> the OSS community. His enthusiasm and care for the community is contagious.
>>> He is humble and inclusive. Made so many contributions. He will be deeply
>>> missed and his impact lives on.
>>>
>>> On Mon, Dec 5, 2022 at 8:32 AM Eduard Tudenhoefner 
>>> wrote:
>>>
 Kyle was a really great friend and colleague. I've enjoyed working with
 him in the OSS community and at Tabular in particular. He was a welcoming
 and open person and helped the OSS community in so many different ways. He
 will be missed!

 On Mon, Dec 5, 2022 at 4:18 PM Russell Spitzer <
 russell.spit...@gmail.com> wrote:

> Kyle always made my day a little brighter. He was a great friend and
> welcoming member of this community. I know I'll miss having him here. I'm
> glad I had the privilege of getting to work with him both at Apple and
> within the OSS world.
>
> On Mon, Dec 5, 2022 at 9:12 AM Holden Karau 
> wrote:
>
>> Hi Everyone,
>>
>> I know that many (if not all) of us have had the chance to work with
>> Kyle  and get to know him and his
>> wonderful sense of humor. Unfortunately, he passed away recently. I feel
>> fortunate to have had the opportunity to work with him and be friends. He
>> did some amazing work in Iceberg and cared about making our open-source
>> communities more welcoming and inclusive. He will be missed greatly. Next
>> time I am in Catalonia, I will raise a glass in his memory in a gay bar.
>>
>> If it would help anyone to share their memories of Kyle, I know I'd love
>> to hear them and I'll collect them and pass them on to his family.
>>
>> Take care of each other.
>>
>> Hugs & Love,
>>
>> Holden
>>
>


Re: [VOTE] Release Apache PyIceberg 0.2.0

2022-12-05 Thread Driesprong, Fokko
+1 (non-binding)

Checked the signatures, checksums, and licenses. Ran a couple of scans
which worked fine.

The test_missing_uri failure is likely, as Ryan mentioned, caused by the test
picking up your local config file. I run the tests in a fresh Docker container,
but it would be nice to isolate this in the future. Dan is right about the hard
dependency; I've created a fix for it. Since PyArrow is required for the
current read path, this went unnoticed.
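
One way such isolation could look, as a rough sketch only: it assumes the
configuration file is located through the user's home directory or a
PYICEBERG_HOME environment variable, and the helper name isolated_home is made
up for illustration. It is not the fix that was merged.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def isolated_home():
    """Temporarily point home-directory variables at an empty directory.

    Assumption: the config lookup goes through these variables, so a
    ~/.pyiceberg.yaml on the developer's machine is not found.
    """
    names = ("HOME", "USERPROFILE", "PYICEBERG_HOME")
    saved = {name: os.environ.get(name) for name in names}
    with tempfile.TemporaryDirectory() as empty_dir:
        for name in names:
            os.environ[name] = empty_dir
        try:
            yield empty_dir
        finally:
            # Restore the original environment, removing variables that
            # were not set before.
            for name, value in saved.items():
                if value is None:
                    os.environ.pop(name, None)
                else:
                    os.environ[name] = value
```

A test such as test_missing_uri could then run its CLI invocation inside
`with isolated_home():` so that a local default catalog does not change the
exit code.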



On Mon, Dec 5, 2022 at 7:44 PM Ryan Blue wrote:

> Dan, the test_missing_uri failure is because you have a default catalog in
> your .pyiceberg.yaml file. I hit that all the time, too.
>
> On Mon, Dec 5, 2022 at 10:16 AM Daniel Weeks 
> wrote:
>
>> +1 (binding)
>>
>> I verified:
>>   - sigs, sums, licenses, tests
>>
>> I tested:
>>   - REST Catalog implementation
>>   - creating/loading/renaming/dropping tables
>>   - creating/removing namespace properties
>>   - loading table to arrow dataframe via scan (and pandas via arrow)
>>   - loading table to duckdb
>>
>> I had one test failure locally (though it appears to pass in CI, so I
>> consider this a non-blocker unless it is affecting everyone)
>> test_missing_uri
>>     def test_missing_uri():
>>         runner = CliRunner()
>>         result = runner.invoke(run, ["list"])
>> >       assert result.exit_code == 1
>> E       assert 0 == 1
>> E        +  where 0 = .exit_code
>>
>> tests/cli/test_console.py:140: AssertionError
>> FAILED tests/cli/test_console.py::test_missing_uri - assert 0 == 1
>>
>> A few things I noted:
>>  - I cannot import 'load_catalog' without the pyarrow dependency.  So
>> that seems like a hard dependency at this point and we should work around it
>> in the future.
>>
>> On Mon, Dec 5, 2022 at 9:05 AM Jun H.  wrote:
>>
>>> Hi Everyone,
>>>
>>> I propose that we release the following RC as the official PyIceberg
>>> 0.2.0 release.
>>>
>>> The commit ID is 577867e88da86ab70f4efcb12ab993d01062712a
>>> * This corresponds to the tag: pyiceberg-0.2.0rc1
>>> (509a38cc1f08f399712c1e0a65e53f1ad7749153)
>>> * https://github.com/apache/iceberg/releases/tag/pyiceberg-0.2.0rc1
>>> *
>>> https://github.com/apache/iceberg/tree/577867e88da86ab70f4efcb12ab993d01062712a
>>>
>>> The release tarball, signature, and checksums are here:
>>> * https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.2.0rc1/
>>>
>>> You can find the KEYS file here:
>>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>>
>>> Convenience binary artifacts are staged on pypi:
>>> https://pypi.org/project/pyiceberg/0.2.0rc1/
>>>
>>> And can be installed using: pip3 install pyiceberg==0.2.0rc1
>>>
>>> Please download, verify, and test.
>>>
>>> Please vote in the next 72 hours.
>>>
>>> [ ] +1 Release this as PyIceberg 0.2.0
>>> [ ] +0
>>> [ ] -1 Do not release this because...
>>>
>>>
>
> --
> Ryan Blue
> Tabular
>


[VOTE] Release Apache PyIceberg 0.2.1

2022-12-27 Thread Driesprong, Fokko
Hi Everyone,


First of all, I hope y'all had a great Christmas!


Last week we fixed <https://github.com/apache/iceberg/pull/6478> an issue
<https://github.com/apache/iceberg/issues/6469> in PyIceberg that breaks
filtering (and partitioning) on date fields. As discussed in the Iceberg
sync, I propose that we release the following RC as the official PyIceberg
0.2.1 release.


This adds the following commits on top of 0.2.0:


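   - Python: Read date as an int #6487
   <https://github.com/apache/iceberg/pull/6487>
   - Python: Bump version to 0.2.1 #6483
   <https://github.com/apache/iceberg/pull/6483>
   - Python: Fix reading UUIDs #6486
   <https://github.com/apache/iceberg/pull/6486>
   - Python: Fix PyArrow import #6484
   <https://github.com/apache/iceberg/pull/6484>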
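
As background on the date fix: Iceberg represents a date as the number of days
since 1970-01-01, so filter and partition values have to be compared in that
integer form. A small sketch of the conversion, with hypothetical helper names
that are not PyIceberg's API:

```python
from datetime import date, timedelta

# Iceberg dates count days from this day.
EPOCH_DATE = date(1970, 1, 1)


def date_to_days(value: date) -> int:
    """Convert a calendar date to days since 1970-01-01."""
    return (value - EPOCH_DATE).days


def days_to_date(days: int) -> date:
    """Convert days since 1970-01-01 back to a calendar date."""
    return EPOCH_DATE + timedelta(days=days)


# Hypothetical filter value: rows on or after 2022-12-27
filter_days = date_to_days(date(2022, 12, 27))
```

Reading the stored value as a plain int and converting the filter literal the
same way keeps both sides of the comparison in one representation.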

Instructions on how to verify the release can be found here
<https://py.iceberg.apache.org/verify-release/>.


The commit ID is d48b60e71232fe5c27ca4f4238adeefa1faa9ddf


* This corresponds to the tag: pyiceberg-0.2.1rc0
(072c7b6806f13ea314e1293d71fc9243b6b4be50)

* https://github.com/apache/iceberg/releases/tag/pyiceberg-0.2.1rc0

*
https://github.com/apache/iceberg/tree/d48b60e71232fe5c27ca4f4238adeefa1faa9ddf


The release tarball, signature, and checksums are here:


* https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.2.1rc0/


You can find the KEYS file here:


* https://dist.apache.org/repos/dist/dev/iceberg/KEYS


Convenience binary artifacts are staged on pypi:


https://pypi.org/project/pyiceberg/0.2.1rc0/


And can be installed using: pip3 install pyiceberg==0.2.1rc0


Please download, verify, and test.
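
For the checksum part of the verification, a minimal sketch using only the
Python standard library. It assumes the .sha512 file starts with the hex digest
(optionally followed by the file name), and the function names are
illustrative. Verifying the signature still requires gpg and the KEYS file.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(tarball: Path, checksum_file: Path) -> bool:
    """Compare a tarball against the digest stored in its checksum file.

    Assumption: the first whitespace-separated token in the checksum file
    is the expected hex digest.
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(tarball) == expected
```

Usage would be along the lines of
`checksum_matches(Path("pyiceberg-0.2.1.tar.gz"), Path("pyiceberg-0.2.1.tar.gz.sha512"))`,
with the file names adjusted to whatever was downloaded.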


Please vote in the next 72 hours.

[ ] +1 Release this as PyIceberg 0.2.1

[ ] +0

[ ] -1 Do not release this because...


Kind regards,

Fokko Driesprong


Re: [VOTE] Release Apache PyIceberg 0.2.1

2023-01-02 Thread Driesprong, Fokko
Hey everyone!

First of all, happy new year! Best wishes for 2023 to you and your family.
Just a gentle reminder to see if we can release this RC to fix some critical
bugs in 0.2.0.

Thanks!

Fokko Driesprong

On Tue, Dec 27, 2022 at 9:17 AM Driesprong, Fokko wrote:

> Hi Everyone,
>
>
> First of all, I hope y'all had a great Christmas!
>
>
> Last week we fixed <https://github.com/apache/iceberg/pull/6478> an issue
> <https://github.com/apache/iceberg/issues/6469> in PyIceberg that breaks
> filtering (and partitioning) on date fields. As discussed in the Iceberg
> sync, I propose that we release the following RC as the official
> PyIceberg 0.2.1 release.
>
>
> This adds the following commits on top of 0.2.0:
>
>
>- Python: Read date as an int #6487
><https://github.com/apache/iceberg/pull/6487>
>- Python: Bump version to 0.2.1 #6483
><https://github.com/apache/iceberg/pull/6483>
>- Python: Fix reading UUIDs #6486
><https://github.com/apache/iceberg/pull/6486>
>- Python: Fix PyArrow import #6484
><https://github.com/apache/iceberg/pull/6484>
>
> Instructions on how to verify the release can be found here
> <https://py.iceberg.apache.org/verify-release/>.
>
>
> The commit ID is d48b60e71232fe5c27ca4f4238adeefa1faa9ddf
>
>
> * This corresponds to the tag: pyiceberg-0.2.1rc0
> (072c7b6806f13ea314e1293d71fc9243b6b4be50)
>
> * https://github.com/apache/iceberg/releases/tag/pyiceberg-0.2.1rc0
>
> *
> https://github.com/apache/iceberg/tree/d48b60e71232fe5c27ca4f4238adeefa1faa9ddf
>
>
> The release tarball, signature, and checksums are here:
>
>
> * https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.2.1rc0/
>
>
> You can find the KEYS file here:
>
>
> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>
>
> Convenience binary artifacts are staged on pypi:
>
>
> https://pypi.org/project/pyiceberg/0.2.1rc0/
>
>
> And can be installed using: pip3 install pyiceberg==0.2.1rc0
>
>
> Please download, verify, and test.
>
>
> Please vote in the next 72 hours.
>
> [ ] +1 Release this as PyIceberg 0.2.1
>
> [ ] +0
>
> [ ] -1 Do not release this because...
>
>
> Kind regards,
>
> Fokko Driesprong
>


Re: [VOTE] Release Apache PyIceberg 0.2.1

2023-01-06 Thread Driesprong, Fokko
Hi Ryan, Daniel,

Thanks for giving it a try and for voting.

> Tests are passing, although in a new environment I get the error about not
> being able to load pyparsing. It would be nice to fix that, but since it
> isn't a regression, I think this release should not be blocked by it.
>

This was fixed in #6439 <https://github.com/apache/iceberg/pull/6439> and
should probably have been backported to 0.2.1 as well.

>  - tests fail if you have a ~/.pyiceberg.yaml configuration (we should
> probably fix this, but is not a blocker)


This was fixed in #6445 <https://github.com/apache/iceberg/pull/6445>, and
I've also created a PR <https://github.com/apache/iceberg/pull/6521> to add
the suggestion of using a fresh docker container.

Is there one more PMC member around for a vote? :)

Cheers, Fokko



On Tue, Jan 3, 2023 at 1:33 AM Daniel Weeks wrote:

> +1 (binding)
>
> Verified:
>  - sigs/sums/license/tests
>  - removal of hard dependency on pyarrow
>
> I don't believe the following should block this release unless they were
> introduced in this patch version, but I ran into the following:
>  - pyparsing was not installed during 'make install' and tests failed
>  - tests fail if you have a ~/.pyiceberg.yaml configuration (we should
> probably fix this, but is not a blocker)
>
> -Dan
>
> On Mon, Jan 2, 2023 at 12:22 AM Driesprong, Fokko 
> wrote:
>
>> Hey everyone!
>>
>> First of all, happy new year! Best wishes for 2023 to you and your
>> family. Just a gentle reminder to see if we can release this RC to fix some
>> critical bugs in 0.2.0.
>>
>> Thanks!
>>
>> Fokko Driesprong
>>
>> On Tue, Dec 27, 2022 at 9:17 AM Driesprong, Fokko wrote:
>>
>>> Hi Everyone,
>>>
>>>
>>> First of all, I hope y'all had a great Christmas!
>>>
>>>
>>> Last week we fixed <https://github.com/apache/iceberg/pull/6478> an
>>> issue <https://github.com/apache/iceberg/issues/6469> in PyIceberg that
>>> breaks filtering (and partitioning) on date fields. As discussed in the
>>> Iceberg sync, I propose that we release the following RC as the
>>> official PyIceberg 0.2.1 release.
>>>
>>>
>>> This adds the following commits on top of 0.2.0:
>>>
>>>
>>>- Python: Read date as an int #6487
>>><https://github.com/apache/iceberg/pull/6487>
>>>- Python: Bump version to 0.2.1 #6483
>>><https://github.com/apache/iceberg/pull/6483>
>>>- Python: Fix reading UUIDs #6486
>>><https://github.com/apache/iceberg/pull/6486>
>>>- Python: Fix PyArrow import #6484
>>><https://github.com/apache/iceberg/pull/6484>
>>>
>>> Instructions on how to verify the release can be found here
>>> <https://py.iceberg.apache.org/verify-release/>.
>>>
>>>
>>> The commit ID is d48b60e71232fe5c27ca4f4238adeefa1faa9ddf
>>>
>>>
>>> * This corresponds to the tag: pyiceberg-0.2.1rc0
>>> (072c7b6806f13ea314e1293d71fc9243b6b4be50)
>>>
>>> * https://github.com/apache/iceberg/releases/tag/pyiceberg-0.2.1rc0
>>>
>>> *
>>> https://github.com/apache/iceberg/tree/d48b60e71232fe5c27ca4f4238adeefa1faa9ddf
>>>
>>>
>>> The release tarball, signature, and checksums are here:
>>>
>>>
>>> * https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.2.1rc0/
>>>
>>>
>>> You can find the KEYS file here:
>>>
>>>
>>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>>
>>>
>>> Convenience binary artifacts are staged on pypi:
>>>
>>>
>>> https://pypi.org/project/pyiceberg/0.2.1rc0/
>>>
>>>
>>> And can be installed using: pip3 install pyiceberg==0.2.1rc0
>>>
>>>
>>> Please download, verify, and test.
>>>
>>>
>>> Please vote in the next 72 hours.
>>>
>>> [ ] +1 Release this as PyIceberg 0.2.1
>>>
>>> [ ] +0
>>>
>>> [ ] -1 Do not release this because...
>>>
>>>
>>> Kind regards,
>>>
>>> Fokko Driesprong
>>>
>>


Re: [VOTE] Release Apache PyIceberg 0.2.1

2023-01-06 Thread Driesprong, Fokko
Thanks for the vote, and I'm aware that it is still the holiday season, so
no apologies are needed for the delay.

Kind regards,
Fokko Driesprong


On Fri, Jan 6, 2023 at 7:02 PM Jack Ye wrote:

> +1 (binding)
>
> Just came back from vacation and am trying to catch up; sorry for the
> delayed vote.
>
> - Verified checksum, license, signature
> - Verified tests passing
> - Verified Glue catalog related operations working
> - was able to reproduce the 2 issues that are identified, but those are
> non blocking as discussed.
>
> Best,
> Jack Ye
>
> On Fri, Jan 6, 2023 at 9:46 AM Driesprong, Fokko  wrote:
>
>> Hi Ryan, Daniel,
>>
>> Thanks for giving it a try and for voting.
>>
>>> Tests are passing, although in a new environment I get the error about
>>> not being able to load pyparsing. It would be nice to fix that, but since
>>> it isn't a regression, I think this release should not be blocked by it.
>>>
>>
>> This was fixed in #6439 <https://github.com/apache/iceberg/pull/6439> and
>> should probably have been backported to 0.2.1 as well.
>>
>>>  - tests fail if you have a ~/.pyiceberg.yaml configuration (we should
>>> probably fix this, but is not a blocker)
>>
>>
>> This was fixed in #6445 <https://github.com/apache/iceberg/pull/6445>, and
>> I've also created a PR <https://github.com/apache/iceberg/pull/6521> to
>> add the suggestion of using a fresh docker container.
>>
>> Is there one more PMC member around for a vote? :)
>>
>> Cheers, Fokko
>>
>>
>>
>> On Tue, Jan 3, 2023 at 1:33 AM Daniel Weeks wrote:
>>
>>> +1 (binding)
>>>
>>> Verified:
>>>  - sigs/sums/license/tests
>>>  - removal of hard dependency on pyarrow
>>>
>>> I don't believe the following should block this release unless they were
>>> introduced in this patch version, but I ran into the following:
>>>  - pyparsing was not installed during 'make install' and tests failed
>>>  - tests fail if you have a ~/.pyiceberg.yaml configuration (we should
>>> probably fix this, but is not a blocker)
>>>
>>> -Dan
>>>
>>> On Mon, Jan 2, 2023 at 12:22 AM Driesprong, Fokko 
>>> wrote:
>>>
>>>> Hey everyone!
>>>>
>>>> First of all, happy new year! Best wishes for 2023 to you and your
>>>> family. Just a gentle reminder to see if we can release this RC to fix some
>>>> critical bugs in 0.2.0.
>>>>
>>>> Thanks!
>>>>
>>>> Fokko Driesprong
>>>>
>>>> On Tue, Dec 27, 2022 at 9:17 AM Driesprong, Fokko wrote:
>>>>
>>>>> Hi Everyone,
>>>>>
>>>>>
>>>>> First of all, I hope y'all had a great Christmas!
>>>>>
>>>>>
>>>>> Last week we fixed <https://github.com/apache/iceberg/pull/6478> an
>>>>> issue <https://github.com/apache/iceberg/issues/6469> in PyIceberg
>>>>> that breaks filtering (and partitioning) on date fields. As discussed in
>>>>> the Iceberg sync, I propose that we release the following RC as the
>>>>> official PyIceberg 0.2.1 release.
>>>>>
>>>>>
>>>>> This adds the following commits on top of 0.2.0:
>>>>>
>>>>>
>>>>>- Python: Read date as an int #6487
>>>>><https://github.com/apache/iceberg/pull/6487>
>>>>>- Python: Bump version to 0.2.1 #6483
>>>>><https://github.com/apache/iceberg/pull/6483>
>>>>>- Python: Fix reading UUIDs #6486
>>>>><https://github.com/apache/iceberg/pull/6486>
>>>>>- Python: Fix PyArrow import #6484
>>>>><https://github.com/apache/iceberg/pull/6484>
>>>>>
>>>>> Instructions on how to verify the release can be found here
>>>>> <https://py.iceberg.apache.org/verify-release/>.
>>>>>
>>>>>
>>>>> The commit ID is d48b60e71232fe5c27ca4f4238adeefa1faa9ddf
>>>>>
>>>>>
>>>>> * This corresponds to the tag: pyiceberg-0.2.1rc0
>>>>> (072c7b6806f13ea314e1293d71fc9243b6b4be50)
>>>>>
>>>>> * https://github.com/apache/iceberg/releases/tag/pyiceberg-0.2.1rc0
>>>>>
>>>>> *
>>>>> https://github.com/apache/iceberg/tree/d48b60e71232fe5c27ca4f4238adeefa1faa9ddf
>>>>>
>>>>>
>>>>> The release tarball, signature, and checksums are here:
>>>>>
>>>>>
>>>>> * https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.2.1rc0/
>>>>>
>>>>>
>>>>> You can find the KEYS file here:
>>>>>
>>>>>
>>>>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>>>>
>>>>>
>>>>> Convenience binary artifacts are staged on pypi:
>>>>>
>>>>>
>>>>> https://pypi.org/project/pyiceberg/0.2.1rc0/
>>>>>
>>>>>
>>>>> And can be installed using: pip3 install pyiceberg==0.2.1rc0
>>>>>
>>>>>
>>>>> Please download, verify, and test.
>>>>>
>>>>>
>>>>> Please vote in the next 72 hours.
>>>>>
>>>>> [ ] +1 Release this as PyIceberg 0.2.1
>>>>>
>>>>> [ ] +0
>>>>>
>>>>> [ ] -1 Do not release this because...
>>>>>
>>>>>
>>>>> Kind regards,
>>>>>
>>>>> Fokko Driesprong
>>>>>
>>>>


[ANNOUNCE] Apache PyIceberg release 0.2.1

2023-01-06 Thread Driesprong, Fokko
I'm pleased to announce the release of Apache PyIceberg 0.2.1!

Apache Iceberg is an open table format for huge analytic datasets.
Iceberg delivers high query performance for tables with tens of petabytes
of data, along with atomic commits, concurrent writes, and SQL-compatible
table evolution.

This adds the following commits on top of 0.2.0 as a patch release:

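   - Python: Read date as an int #6487
   <https://github.com/apache/iceberg/pull/6487>
   - Python: Bump version to 0.2.1 #6483
   <https://github.com/apache/iceberg/pull/6483>
   - Python: Fix reading UUIDs #6486
   <https://github.com/apache/iceberg/pull/6486>
   - Python: Fix PyArrow import #6484
   <https://github.com/apache/iceberg/pull/6484>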

This Python release can be downloaded from:
https://pypi.org/project/pyiceberg/0.2.1/

Thanks to everyone for contributing, and looking forward to 0.3.0! Feel free
to reach out if
you want to include anything in the next release.

Kind regards,
Fokko


[VOTE] Release Apache PyIceberg 0.3.0

2023-01-25 Thread Driesprong, Fokko
Hi Everyone,


I propose that we release the following RC as the official PyIceberg 0.3.0
release.


The commit ID is 2671621565cde8adda27b81d1699663f71d9b3d4


* This corresponds to the tag: pyiceberg-0.3.0rc1
(cf941fe6ae30fbfe98235d3799448cb9f717e1e6)

* https://github.com/apache/iceberg/releases/tag/pyiceberg-0.3.0rc1

*
https://github.com/apache/iceberg/tree/2671621565cde8adda27b81d1699663f71d9b3d4


This release has support for ID-based projections, to correctly handle
renames and promotions, performance improvement by loading the manifests in
parallel, and it also contains a lot of important bug fixes.
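
To illustrate why projecting by field ID handles renames, here is a rough
sketch. It assumes schemas reduced to (field ID, name) pairs; the function and
variable names are made up for illustration and are not PyIceberg's API, and
type promotion is left out.

```python
from typing import Any, Dict, List, Tuple

# A schema reduced to (field_id, name) pairs for the purpose of this sketch.
Schema = List[Tuple[int, str]]


def project_by_id(row: Dict[str, Any], file_schema: Schema, table_schema: Schema) -> Dict[str, Any]:
    """Map a row written with file_schema onto the current table_schema.

    Columns are matched on field ID, so a renamed column still finds its
    data; columns added after the file was written come back as None.
    """
    file_names_by_id = {field_id: name for field_id, name in file_schema}
    projected: Dict[str, Any] = {}
    for field_id, current_name in table_schema:
        old_name = file_names_by_id.get(field_id)
        projected[current_name] = row.get(old_name) if old_name is not None else None
    return projected


# The data file was written when field 2 was still called "ts".
file_schema = [(1, "id"), (2, "ts")]
# Since then, field 2 was renamed and field 3 was added.
table_schema = [(1, "id"), (2, "event_time"), (3, "source")]
result = project_by_id({"id": 7, "ts": "2023-01-25"}, file_schema, table_schema)
```

Matching on names instead would lose the "ts" data after the rename, which is
the case ID-based projection is meant to cover.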


The release tarball, signature, and checksums are here:


* https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.3.0rc1/


You can find the KEYS file here:


* https://dist.apache.org/repos/dist/dev/iceberg/KEYS


Convenience binary artifacts are staged on pypi:


https://pypi.org/project/pyiceberg/0.3.0rc1/


And can be installed using: pip3 install pyiceberg==0.3.0rc1


Instructions on how to verify the release can be found on the docs page:

https://py.iceberg.apache.org/verify-release/


If there is anything, please don't hesitate to reach out.


Please download, verify, and test.


Please vote in the next 72 hours.

[ ] +1 Release this as PyIceberg 0.3.0

[ ] +0

[ ] -1 Do not release this because...


Kind regards,

Fokko


Re: [VOTE] Release Apache PyIceberg 0.3.0

2023-01-29 Thread Driesprong, Fokko
Thanks for running the checks Daniel, and I can confirm that those files
are missing. I think it makes sense to add them. Amogh, do you want to
create a PR for this? I think it would be best to just include the dev
folder, to make sure that we don't forget to include future files.
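
A quick way to check a candidate tarball for these files before sending out an
RC, sketched with the standard library only. The function name is hypothetical,
and the sketch assumes the tarball has a single top-level directory such as
pyiceberg-0.3.0/.

```python
import tarfile
from typing import Iterable, List


def missing_from_tarball(tarball_path: str, required: Iterable[str]) -> List[str]:
    """Return the required paths that are absent from the tarball.

    Paths are compared relative to the tarball's top-level directory.
    """
    with tarfile.open(tarball_path) as archive:
        # Drop the leading "<package>-<version>/" component of each member.
        names = {name.split("/", 1)[1] for name in archive.getnames() if "/" in name}
    return [path for path in required if path not in names]


# Files mentioned in this thread as missing from the 0.3.0rc1 tarball.
REQUIRED_DEV_FILES = ["dev/.rat-excludes", "dev/check-license", "dev/run-minio.sh"]
```

Calling `missing_from_tarball("pyiceberg-0.3.0.tar.gz", REQUIRED_DEV_FILES)`
(file name assumed) should return an empty list once the dev folder is
included.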

I’ll dig into the warning thrown by the test.
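
For reference, one common way to silence this kind of PytestCollectionWarning
is to mark the helper class as not being a test. The sketch below uses a plain
class as a stand-in for the pydantic-based model in tests/test_transforms.py,
and it is only one option, not necessarily the change that will be made.

```python
class TestType:
    """Helper model used by the tests; it is not a test case itself."""

    # pytest skips classes whose __test__ attribute is False, so it no longer
    # tries (and fails) to collect this class because of its constructor.
    __test__ = False

    def __init__(self, value: int) -> None:
        self.value = value


helper = TestType(42)
```

Renaming the class so that it does not start with "Test" would have the same
effect.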

Kind regards,
Fokko


On Mon, Jan 30, 2023 at 2:57 AM Jahagirdar, Amogh wrote:

> If I’m not mistaken, I think the issue is that the release tarball is
> missing the rat-excludes file and the relevant scripts from the dev folder.
> I ran the tests by checking out the source at the release candidate tag
> which has all the relevant files.
>
>
>
> I think we should include these files in the release tarball, which would
> require another RC because of the new signature and checksum. I was looking
> at the 0.2.0 release tarball, and it looks like these files also weren’t in
> there so it seems intentional to exclude these files?
>
>
>
> Would like to get the community’s thoughts on this!
>
>
>
> Thanks,
>
>
>
> Amogh Jahagirdar
>
>
>
> *From: *Daniel Weeks 
> *Reply-To: *"dev@iceberg.apache.org" 
> *Date: *Sunday, January 29, 2023 at 9:41 AM
> *To: *"dev@iceberg.apache.org" 
> *Subject: *RE: [EXTERNAL][VOTE] Release Apache PyIceberg 0.3.0
>
>
>
>
>
>
>
>
>
>
> +0
>
>
>
> Verified sigs and sums, but ran into the following issues running through
> the verification steps:
>
>
>
> *Ran into the following error verifying the licenses.  Touching the file
> fixed the issue.*
>
> $ ./dev/check-license
> Attempting to fetch rat
> Exception in thread "main" java.io.FileNotFoundException:
> /Users/dweeks/workspace/apache/releases/pyiceberg/0.3.0-rc1/pyiceberg-0.3.0/dev/.rat-excludes
> (No such file or directory)
> at java.base/java.io.FileInputStream.open0(Native Method)
> at java.base/java.io.FileInputStream.open(FileInputStream.java:216)
> at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
> at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:2388)
> at org.apache.commons.io.FileUtils.readLines(FileUtils.java:2561)
> at org.apache.rat.Report.main(Report.java:89)
> RAT exited abnormally
>
>
>
> *One warning when running tests:*
>
> ===== warnings summary =====
> tests/test_transforms.py:423
>
> /Users/dweeks/workspace/apache/releases/pyiceberg/0.3.0-rc1/pyiceberg-0.3.0/tests/test_transforms.py:423:
> PytestCollectionWarning: cannot collect test class 'TestType' because it
> has a __init__ constructor (from: tests/test_transforms.py)
> class TestType(IcebergBaseModel):
>
> -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
>
>
>
> *Verification step `make test-s3` failed with the following:*
>
> $ make test-s3
> sh ./dev/run-minio.sh
> sh: ./dev/run-minio.sh: No such file or directory
> make: *** [test-s3] Error 127
>
>
>
> On Wed, Jan 25, 2023 at 6:23 AM Driesprong, Fokko 
> wrote:
>
> Hi Everyone,
>
>
>
> I propose that we release the following RC as the official PyIceberg 0.3.0
> release.
>
>
>
> The commit ID is 2671621565cde8adda27b81d1699663f71d9b3d4
>
>
>
> * This corresponds to the tag: pyiceberg-0.3.0rc1
> (cf941fe6ae30fbfe98235d3799448cb9f717e1e6)
>
> * https://github.com/apache/iceberg/releases/tag/pyiceberg-0.3.0rc1
>
> *
> https://github.com/apache/iceberg/tree/2671621565cde8adda27b81d1699663f71d9b3d4
>
>
>
> This release has support for ID-based projections, to correctly handle
> renames and promotions, performance improvement by loading the manifests in
> parallel, and it also contains a lot of important bug fixes.
>
>
>
> The release tarball, signature, and checksums are here:
>
>
>
> * https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.3.0rc1/
>
>
>
> You can find the KEYS file here:
>
>
>
> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>
>
>
> Convenience binary artifacts are staged on pypi:
>
>
>
> https://pypi.org/project/pyiceberg/0.3.0rc1/
>
>
>
> And can be installed using: pip3 install pyiceberg==0.3.0rc1
>
>
>
> Instructions on how to verify the release can be found on the docs page:
>
> https://py.iceberg.apache.org/verify-release/
>
>
>
> If there is anything, please don't hesitate to reach out.
>
>
>
> Please download, verify, and test.
>
>
>
> Please vote in the next 72 hours.
>
> [ ] +1 Release this as PyIceberg 0.3.0
>
> [ ] +0
>
> [ ] -1 Do not release this because...
>
>
>
> Kind regards,
>
> Fokko
>
>


Re: [VOTE] Release Apache PyIceberg 0.3.0

2023-01-30 Thread Driesprong, Fokko
Perfect, and I agree that a branch is a great idea. There is some work on
the current master branch that needs a bit more testing before releasing
that to the public. I've created a branch called pyiceberg-0.3.x
<https://github.com/apache/iceberg/tree/pyiceberg-0.3.x>.

Kind regards,
Fokko

On Mon, Jan 30, 2023 at 8:14 AM Jahagirdar, Amogh wrote:

> Sure, happy to raise a PR! I think we’ll need to create a branch off of
> the 0.3.0 commit to update for this release, and then a separate PR with
> the change against master as well for future releases. Let me know if this
> is what you had in mind.
>
>
>
> Thanks,
>
>
>
> Amogh Jahagirdar
>
>
>
> *From: *"Driesprong, Fokko" 
> *Reply-To: *"dev@iceberg.apache.org" 
> *Date: *Sunday, January 29, 2023 at 10:19 PM
> *To: *"dev@iceberg.apache.org" 
> *Subject: *RE: [EXTERNAL][VOTE] Release Apache PyIceberg 0.3.0
>
>
>
>
>
>
> Thanks for running the checks Daniel, and I can confirm that those files
> are missing. I think it makes sense to add them. Amogh, do you want to
> create a PR for this? I think it would be best to just include the dev
> folder, to make sure that we don't forget to include future files.
>
>
>
> I’ll dig into the warning thrown by the test.
>
>
>
> Kind regards,
>
> Fokko
>
>
>
>
>
> On Mon, Jan 30, 2023 at 2:57 AM Jahagirdar, Amogh wrote:
>
> If I’m not mistaken, I think the issue is that the release tarball is
> missing the rat-excludes file and the relevant scripts from the dev folder.
> I ran the tests by checking out the source at the release candidate tag
> which has all the relevant files.
>
>
>
> I think we should include these files in the release tarball, which would
> require another RC because of the new signature and checksum. I was looking
> at the 0.2.0 release tarball, and it looks like these files also weren’t in
> there so it seems intentional to exclude these files?
>
>
>
> Would like to get the community’s thoughts on this!
>
>
>
> Thanks,
>
>
>
> Amogh Jahagirdar
>
>
>
> *From: *Daniel Weeks 
> *Reply-To: *"dev@iceberg.apache.org" 
> *Date: *Sunday, January 29, 2023 at 9:41 AM
> *To: *"dev@iceberg.apache.org" 
> *Subject: *RE: [EXTERNAL][VOTE] Release Apache PyIceberg 0.3.0
>
>
>
>
>
>
>
> +0
>
>
>
> Verified sigs and sums, but ran into the following issues running through
> the verification steps:
>
>
>
> *Ran into the following error verifying the licenses.  Touching the file
> fixed the issue.*
>
> $ ./dev/check-license
> Attempting to fetch rat
> Exception in thread "main" java.io.FileNotFoundException:
> /Users/dweeks/workspace/apache/releases/pyiceberg/0.3.0-rc1/pyiceberg-0.3.0/dev/.rat-excludes
> (No such file or directory)
> at java.base/java.io.FileInputStream.open0(Native Method)
> at java.base/java.io.FileInputStream.open(FileInputStream.java:216)
> at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
> at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:2388)
> at org.apache.commons.io.FileUtils.readLines(FileUtils.java:2561)
> at org.apache.rat.Report.main(Report.java:89)
> RAT exited abnormally
>
>
>
> *One warning when running tests:*
>
> ============================= warnings summary =============================
> tests/test_transforms.py:423
>
> /Users/dweeks/workspace/apache/releases/pyiceberg/0.3.0-rc1/pyiceberg-0.3.0/tests/test_transforms.py:423:
> PytestCollectionWarning: cannot collect test class 'TestType' because it
> has a __init__ constructor (from: tests/test_transforms.py)
> class TestType(IcebergBaseModel):
>
> -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
>
>
>
> *Verification step `make test-s3` failed with the following:*
>
> $ make test-s3
> sh ./dev/run-minio.sh
> sh: ./dev/run-minio.sh: No such file or directory
> make: *** [test-s3] Error 127
>
>
>
> On Wed, Jan 25, 2023 at 6:23 AM Driesprong, Fokko 
> wrote:
>
> Hi Everyone,
>
>
>
> I propose that we release the following RC a

Re: [VOTE] Release Apache PyIceberg 0.3.0

2023-01-30 Thread Driesprong, Fokko
I've created a fix for the warning for the 0.3.0 release:
https://github.com/apache/iceberg/pull/6702
Also, created a PR to fix this on master, and also turned warnings into
errors: https://github.com/apache/iceberg/pull/6703/
<https://github.com/apache/iceberg/pull/6703/>
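
For readers who hit the same PytestCollectionWarning, here is a minimal sketch of one common way to handle it. It is not necessarily what the PRs above do. The `TestType` class below is a stand-in helper, and `raises_when_warnings_are_errors` is a made-up name that only demonstrates the "warnings as errors" idea with the standard `warnings` module (pytest offers the same behaviour through `-W error`).

```python
import warnings


class TestType:
    """Helper whose name starts with 'Test' but which is not a test case."""

    # pytest does not collect classes that set __test__ to False, so it no
    # longer warns about a 'Test*' class that defines __init__.
    __test__ = False

    def __init__(self, value: int) -> None:
        self.value = value


def raises_when_warnings_are_errors() -> bool:
    """Return True if a warning surfaces as an exception under the 'error' filter."""
    with warnings.catch_warnings():
        # Same idea as running pytest with `-W error`: any warning becomes
        # an exception, so it cannot slip into a release unnoticed.
        warnings.simplefilter("error")
        try:
            warnings.warn("deprecated helper", DeprecationWarning)
        except DeprecationWarning:
            return True
    return False
```

Renaming the helper so that it no longer starts with `Test` would work just as well; the attribute is handy when the name has to stay.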

Kind regards,
Fokko

On Mon, 30 Jan 2023 at 15:05, Driesprong, Fokko wrote:

> Perfect, and I agree that a branch is a great idea. There is some work on
> the current master branch that needs a bit more testing before releasing
> that to the public. I've created a branch called pyiceberg-0.3.x
> <https://github.com/apache/iceberg/tree/pyiceberg-0.3.x>.
>
> Kind regards,
> Fokko
>
> On Mon, 30 Jan 2023 at 08:14, Jahagirdar, Amogh wrote:
>
>> Sure, happy to raise a PR! I think we’ll need to create a branch off of
>> the 0.3.0 commit to update for this release, and then a separate PR with
>> the change against master as well for future releases. Let me know if this
>> is what you had in mind.
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Amogh Jahagirdar
>>
>>
>>
>> *From: *"Driesprong, Fokko" 
>> *Reply-To: *"dev@iceberg.apache.org" 
>> *Date: *Sunday, January 29, 2023 at 10:19 PM
>> *To: *"dev@iceberg.apache.org" 
>> *Subject: *RE: [EXTERNAL][VOTE] Release Apache PyIceberg 0.3.0
>>
>>
>>
>> Thanks for running the checks Daniel, and I can confirm that those files
>> are missing. I think it makes sense to add them. Amogh, do you want to
>> create a PR for this? I think it would be best to just include the dev
>> folder, to make sure that we don't forget to include future files.
>>
>>
>>
>> I’ll dig into the warning thrown by the test.
>>
>>
>>
>> Kind regards,
>>
>> Fokko
>>
>>
>>
>>
>>
>> On Mon, 30 Jan 2023 at 02:57, Jahagirdar, Amogh wrote:
>>
>> If I’m not mistaken, I think the issue is that the release tarball is
>> missing the rat-excludes file and the relevant scripts from the dev folder.
>> I ran the tests by checking out the source at the release candidate tag
>> which has all the relevant files.
>>
>>
>>
>> I think we should include these files in the release tarball, which would
>> require another RC because of the new signature and checksum. I was looking
>> at the 0.2.0 release tarball, and it looks like these files also weren’t in
>> there so it seems intentional to exclude these files?
>>
>>
>>
>> Would like to get the community’s thoughts on this!
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Amogh Jahagirdar
>>
>>
>>
>> *From: *Daniel Weeks 
>> *Reply-To: *"dev@iceberg.apache.org" 
>> *Date: *Sunday, January 29, 2023 at 9:41 AM
>> *To: *"dev@iceberg.apache.org" 
>> *Subject: *RE: [EXTERNAL][VOTE] Release Apache PyIceberg 0.3.0
>>
>>
>>
>>
>>
>>
>>
>> +0
>>
>>
>>
>> Verified sigs and sums, but ran into the following issues running through
>> the verification steps:
>>
>>
>>
>> *Ran into the following error verifying the licenses.  Touching the file
>> fixed the issue.*
>>
>> $ ./dev/check-license
>> Attempting to fetch rat
>> Exception in thread "main" java.io.FileNotFoundException:
>> /Users/dweeks/workspace/apache/releases/pyiceberg/0.3.0-rc1/pyiceberg-0.3.0/dev/.rat-excludes
>> (No such file or directory)
>> at java.base/java.io.FileInputStream.open0(Native Method)
>> at java.base/java.io.FileInputStream.open(FileInputStream.java:216)
>> at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
>> at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:2388)
>> at org.apache.commons.io.FileUtils.readLines(FileUtils.java:2561)
>> at org.apache.rat.Report.main(Report.java:89)
>> RAT exited abnormally
>>
>>
>>
>> *One warning when running tests:*
>>
>> ==========
>> warnings summary
>> =

Re: [VOTE] Release Apache PyIceberg 0.3.0

2023-02-02 Thread Driesprong, Fokko
Thanks Amogh for fixing this.

Let me run another RC.

On Mon, 30 Jan 2023 at 08:35, Jahagirdar, Amogh wrote:

> Awesome, I’ll take a look Fokko.
>
> Here’s the PR against the 0.3.0 branch for updating the build to include
> the dev folder: https://github.com/apache/iceberg/pull/6704
>
> PR against the master branch: https://github.com/apache/iceberg/pull/6705
>
>
>
> Thanks,
>
>
>
> Amogh Jahagirdar
>
>
>
> *From: *"Driesprong, Fokko" 
> *Reply-To: *"dev@iceberg.apache.org" 
> *Date: *Monday, January 30, 2023 at 7:19 AM
> *To: *"dev@iceberg.apache.org" 
> *Subject: *RE: [EXTERNAL][VOTE] Release Apache PyIceberg 0.3.0
>
>
>
> I've created a fix for the warning for the 0.3.0 release:
> https://github.com/apache/iceberg/pull/6702
>
> Also, created a PR to fix this on master, and also turned warnings into
> errors: https://github.com/apache/iceberg/pull/6703/
> <https://github.com/apache/iceberg/pull/6703/>
>
>
>
> Kind regards,
>
> Fokko
>
>
>
> On Mon, 30 Jan 2023 at 15:05, Driesprong, Fokko wrote:
>
> Perfect, and I agree that a branch is a great idea. There is some work on
> the current master branch that needs a bit more testing before releasing
> that to the public. I've created a branch called pyiceberg-0.3.x
> <https://github.com/apache/iceberg/tree/pyiceberg-0.3.x>.
>
>
>
> Kind regards,
>
> Fokko
>
>
>
> On Mon, 30 Jan 2023 at 08:14, Jahagirdar, Amogh wrote:
>
> Sure, happy to raise a PR! I think we’ll need to create a branch off of
> the 0.3.0 commit to update for this release, and then a separate PR with
> the change against master as well for future releases. Let me know if this
> is what you had in mind.
>
>
>
> Thanks,
>
>
>
> Amogh Jahagirdar
>
>
>
> *From: *"Driesprong, Fokko" 
> *Reply-To: *"dev@iceberg.apache.org" 
> *Date: *Sunday, January 29, 2023 at 10:19 PM
> *To: *"dev@iceberg.apache.org" 
> *Subject: *RE: [EXTERNAL][VOTE] Release Apache PyIceberg 0.3.0
>
>
>
> Thanks for running the checks Daniel, and I can confirm that those files
> are missing. I think it makes sense to add them. Amogh, do you want to
> create a PR for this? I think it would be best to just include the dev
> folder, to make sure that we don't forget to include future files.
>
>
>
> I’ll dig into the warning thrown by the test.
>
>
>
> Kind regards,
>
> Fokko
>
>
>
>
>
> On Mon, 30 Jan 2023 at 02:57, Jahagirdar, Amogh wrote:
>
> If I’m not mistaken, I think the issue is that the release tarball is
> missing the rat-excludes file and the relevant scripts from the dev folder.
> I ran the tests by checking out the source at the release candidate tag
> which has all the relevant files.
>
>
>
> I think we should include these files in the release tarball, which would
> require another RC because of the new signature and checksum. I was looking
> at the 0.2.0 release tarball, and it looks like these files also weren’t in
> there so it seems intentional to exclude these files?
>
>
>
> Would like to get the community’s thoughts on this!
>
>
>
> Thanks,
>
>
>
> Amogh Jahagirdar
>
>
>
> *From: *Daniel Weeks 
> *Reply-To: *"dev@iceberg.apache.org" 
> *Date: *Sunday, January 29, 2023 at 9:41 AM
> *To: *"dev@iceberg.apache.org" 
> *Subject: *RE: [EXTERNAL][VOTE] Release Apache PyIceberg 0.3.0
>
>
>
>
>
>
>
> +0
>
>
>
> Verified sigs and sums, but ran into the following issues running through
> the verification steps:
>
>
>
> *Ran into the following error verifying the licenses.  Touching the file
> fixed the issue.*
>
> $ ./dev/check-license
> Attempting to fetch rat
> Exception in thread "main" java.io.FileNotFoundException:
> /Users/dweeks/workspace/apache/releases/pyiceberg/0.3.0-rc1/pyiceberg-0.3.0/dev/.rat-excludes
> (No such file or directory)
> at java.base/java.io.FileInputStream.open0(Native Method)
> at java.base/java.io.FileInputS

[VOTE] Release Apache PyIceberg 0.3.0

2023-02-02 Thread Driesprong, Fokko
Hi Everyone,

I propose that we release the following RC as the official PyIceberg 0.3.0
release. This RC ships the fixes needed to run the tests locally, and fixes
the warning raised when running the tests.

I ran the tests from the tarball and can confirm that they pass. Please
consider this my +1 (non-binding).

The commit ID is cb572ac94433710b62b8e8049075bf41faa77119

* This corresponds to the tag: pyiceberg-0.3.0rc2
(3b353228c4be097f975ad75ba40b45eeb255f65e)
* https://github.com/apache/iceberg/releases/tag/pyiceberg-0.3.0rc2
*
https://github.com/apache/iceberg/tree/cb572ac94433710b62b8e8049075bf41faa77119

The release tarball, signature, and checksums are here:

* https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.3.0rc2/

You can find the KEYS file here:

* https://dist.apache.org/repos/dist/dev/iceberg/KEYS

Convenience binary artifacts are staged on pypi:

https://pypi.org/project/pyiceberg/0.3.0rc2/

And can be installed using: pip3 install pyiceberg==0.3.0rc2

Please download, verify, and test.
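
For the checksum part of the verification, a minimal Python sketch could look like the following. It assumes the `.sha512` file starts with the hex digest, optionally followed by the file name. The function names are made up for illustration, and signature verification with GPG is not covered here.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(tarball: Path, checksum_file: Path) -> bool:
    """Compare the tarball digest with the first token of the checksum file."""
    # Assumption: the checksum file looks like "<hex digest>  <file name>".
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(tarball) == expected
```

Calling `checksum_matches` with the downloaded tarball and its `.sha512` companion should return True for an intact download.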

Please vote in the next 72 hours.
[ ] +1 Release this as PyIceberg 0.3.0
[ ] +0
[ ] -1 Do not release this because...

Kind regards,
Fokko Driesprong


Re: [VOTE] Release Apache PyIceberg 0.3.0

2023-02-09 Thread Driesprong, Fokko
Thanks everyone for voting. With 3 binding votes the vote passes and we can
release PyIceberg 0.3.0. Many thanks everyone for testing the release and
contributing to the code!

Votes +1:
Amogh Jahagirdar (non-binding)
Prashant Singh (non-binding)
Jack Ye (binding)
Daniel Weeks (binding)
Eduard Tudenhoefner (non-binding)
Ryan Blue (binding)
Fokko Driesprong (non-binding)

Votes 0: Ø
Votes -1: Ø
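
As a small illustration of how the tally above maps onto the usual rule for Iceberg release votes (at least 3 binding +1 votes, and more binding +1 votes than -1 votes), here is a sketch in Python. The `Vote` class and `vote_passes` function are hypothetical helpers, not part of any project tooling.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int     # +1, 0 or -1
    binding: bool  # True for PMC members


def vote_passes(votes: Iterable[Vote], required_binding: int = 3) -> bool:
    """Pass with >= required_binding binding +1 votes and more binding +1 than -1."""
    votes = list(votes)
    binding_plus = sum(1 for v in votes if v.binding and v.value > 0)
    binding_minus = sum(1 for v in votes if v.binding and v.value < 0)
    return binding_plus >= required_binding and binding_plus > binding_minus


# The votes reported in this message.
VOTES = [
    Vote("Amogh Jahagirdar", +1, False),
    Vote("Prashant Singh", +1, False),
    Vote("Jack Ye", +1, True),
    Vote("Daniel Weeks", +1, True),
    Vote("Eduard Tudenhoefner", +1, False),
    Vote("Ryan Blue", +1, True),
    Vote("Fokko Driesprong", +1, False),
]

print(vote_passes(VOTES))  # True: three binding +1 votes and no -1 votes
```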

I'll publish the release on PyPI shortly.

Kind regards,
Fokko Driesprong


On Wed, 8 Feb 2023 at 19:47, Ryan Blue wrote:

> +1 (binding)
>
> - Validated checksum and signature
> - Ran RAT check
> - Built and tested in Python 3.10.5
> - Installed the rc from pypi
> - Ran pyiceberg commands
>
> I did hit a slight issue: when I installed without the optional pyarrow
> dependency, the CLI failed with this error:
>
> [blue@work pyiceberg-0.3.0]$ pyiceberg --help
> Traceback (most recent call last):
>   File "/home/blue/.pyenv/versions/3.10.5/bin/pyiceberg", line 5, in <module>
>     from pyiceberg.cli.console import run
>   File "/home/blue/.pyenv/versions/3.10.5/lib/python3.10/site-packages/pyiceberg/cli/console.py", line 30, in <module>
>     from pyiceberg.catalog import Catalog, load_catalog
>   File "/home/blue/.pyenv/versions/3.10.5/lib/python3.10/site-packages/pyiceberg/catalog/__init__.py", line 39, in <module>
>     from pyiceberg.table import Table
>   File "/home/blue/.pyenv/versions/3.10.5/lib/python3.10/site-packages/pyiceberg/table/__init__.py", line 47, in <module>
>     from pyiceberg.io.pyarrow import project_table
>   File "/home/blue/.pyenv/versions/3.10.5/lib/python3.10/site-packages/pyiceberg/io/pyarrow.py", line 43, in <module>
>     import pyarrow as pa
> ModuleNotFoundError: No module named 'pyarrow'
>
> Looks like we probably need to catch import errors. I don’t think it
> should fail the release, though.
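
A rough sketch of what catching the import error could look like, using only the standard library. This is not the PyIceberg implementation: `optional_import` and `require` are hypothetical helpers, and the `pyiceberg[pyarrow]` install hint assumes that the extra is named after the dependency.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Return the module if it is installed, otherwise None."""
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError:
        return None


def require(name: str, extra: str) -> ModuleType:
    """Return the module, or fail with an actionable message instead of a bare traceback."""
    module = optional_import(name)
    if module is None:
        # Assumption: the optional dependency is exposed as a pip extra.
        raise ModuleNotFoundError(
            f"{name} is required for this feature; "
            f"install it with: pip install 'pyiceberg[{extra}]'"
        )
    return module
```

With helpers like these, a command such as `pyiceberg --help` could start without the optional dependency and only call `require("pyarrow", "pyarrow")` in the code paths that actually read data.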
>
> On Wed, Feb 8, 2023 at 8:44 AM Eduard Tudenhoefner 
> wrote:
>
>> +1 (non-binding)
>>
>> Verified signatures/checksums/license checks/tests and ran some manual
>> tests with the REST catalog.
>>
>> Thanks Fokko and others, this looks great.
>>
>> Eduard
>>
>>
>> On Tue, Feb 7, 2023 at 8:07 PM Daniel Weeks  wrote:
>>
>>> +1 (binding)
>>>
>>> Verified sig/sums/license/test (including s3)
>>>
>>> Ran through some manual tests using the REST Catalog and everything
>>> worked as expected.
>>>
>>> Looks great,
>>> -Dan
>>>
>>> On Sun, Feb 5, 2023 at 10:08 AM Jack Ye  wrote:
>>>
>>>> +1 (binding)
>>>>
>>>> Verified signature, checksum, RAT
>>>>
>>>> Ran unit and integration tests, plus some manual testing of Glue
>>>> catalog.
>>>>
>>>> Best,
>>>> Jack Ye
>>>>
>>>> On Fri, Feb 3, 2023 at 5:57 PM Prashant Singh 
>>>> wrote:
>>>>
>>>>> +1 (non-binding) Release this as PyIceberg 0.3.0
>>>>>
>>>>> Verified signatures, checksums, RAT checks. Ran the unit and
>>>>> integration tests as per https://py.iceberg.apache.org/verify-release/
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Prashant Singh
>>>>>
>>>>> On Fri, Feb 3, 2023 at 5:06 PM Jahagirdar, Amogh
>>>>>  wrote:
>>>>>
>>>>>> +1 non-binding for the PyIceberg 0.3.0 release. Verified signatures,
>>>>>> checksums, RAT checks. Ran the unit and integration tests.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Amogh Jahagirdar
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From: *"Driesprong, Fokko" 
>>>>>> *Reply-To: *"dev@iceberg.apache.org" 
>>>>>> *Date: *Thursday, February 2, 2023 at 3:59 PM
>>>>>> *To: *"dev@iceberg.apache.org" 
>>>>>> *Subject: *[EXTERNAL] [VOTE] Release Apache PyIceberg 0.3.0
>>>>>>
>>>>>>
>>>>>>
>>>

[ANNOUNCE] Apache PyIceberg release 0.3.0

2023-02-09 Thread Driesprong, Fokko
Hi there,

I'm pleased to announce the release of Apache PyIceberg 0.3.0!

Apache Iceberg is an open table format for huge analytic datasets. Iceberg
delivers high query performance for tables with tens of petabytes of data,
along with atomic commits, concurrent writes, and SQL-compatible table
evolution.

With the new release:
- Full projection by Iceberg Field ID
- Parallelization of job planning that speeds up the metadata operations by
an order of magnitude
- Overhaul of reading Avro files without the use of Pydantic to improve
performance
- Bugfix when reading a zero-length binary field
- Bugfix for PyArrow that led to multiple calls to the filesystem


This Python release can be downloaded from:
https://pypi.org/project/pyiceberg/0.3.0/

Thanks to everyone for contributing!

Kind regards,
Fokko Driesprong


[DISCUSS] Removing python_legacy from the repo

2023-02-17 Thread Driesprong, Fokko
Hi all,

I'd like to discuss removing the python_legacy codebase from the
repository. I think we've reached the point where the new library has
feature parity with the legacy one, so I think that this would be a good
time to remove the legacy one. The main reason for me is to avoid confusion
for newcomers to the project. Let me know what you think, and if you have
any questions or concerns.

Kind regards,
Fokko Driesprong


Re: [DISCUSS] Removing python_legacy from the repo

2023-02-28 Thread Driesprong, Fokko
Hi everyone,

As discussed today at the Iceberg Python sync, I've created a PR for
removing python_legacy from the repo:
https://github.com/apache/iceberg/pull/6960

Again, I would like to thank all the authors working on this! It was a
great inspiration for PyIceberg.

Kind regards,
Fokko Driesprong



On Wed, 22 Feb 2023 at 04:36, Jun H. wrote:

> +1 for removing it. Thanks Fokko.
>
> Jun
>
> On Feb 21, 2023, at 7:47 AM, Daniel Weeks  wrote:
>
> 
> +1 for removal
>
> On Tue, Feb 21, 2023, 7:38 AM Jack Ye  wrote:
>
>> +1 for removing it!
>>
>> Thanks,
>> Jack Ye
>>
>> On Fri, Feb 17, 2023 at 4:18 PM Steve Zhang
>>  wrote:
>>
>>> Thank you Fokko and Ryan for your great work to reach feature parity.
>>> pyiceberg is the way to go!
>>>
>>> Thanks,
>>> Steve Zhang
>>>
>>>
>>>
>>> On Feb 17, 2023, at 8:29 AM, Ryan Blue  wrote:
>>>
>>> +1 for removing it. And it's great to see the new one reaching feature
>>> parity!
>>>
>>> On Fri, Feb 17, 2023 at 1:37 AM Driesprong, Fokko 
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I'd want to discuss removing the python_legacy codebase from the
>>>> repository. I think we've reached the point where the new library has
>>>> feature parity with the legacy one, so I think that this would be a good
>>>> time to remove the legacy one. The main reason for me is to avoid confusion
>>>> for newcomers to the project. Let me know what you think, and if you have
>>>> any questions or concerns.
>>>>
>>>> Kind regards,
>>>> Fokko Driesprong
>>>>
>>>
>>>
>>> --
>>> Ryan Blue
>>> Tabular
>>>
>>>
>>>


Re: [VOTE] Release Apache Iceberg 1.6.1 RC1

2024-08-22 Thread Driesprong, Fokko
It was not correctly backported. I do think we want to add this, since it
fixes a CVE, as mentioned earlier. I've created a PR:
https://github.com/apache/iceberg/pull/10988

Kind regards,
Fokko

On Thu, 22 Aug 2024 at 11:35, Jean-Baptiste Onofré wrote:

> Hi guys,
>
> FYI, the reason I mentioned ORC update is because the PR is "flagged"
> with milestone 1.6.1.
> So it's a bit surprising to not have it in 1.6.1.
>
> We should at least update the PR/issue removing the 1.6.1 milestone,
> else it would not be "accurate".
>
> Regards
> JB
>
> On Thu, Aug 22, 2024 at 12:04 AM Piotr Findeisen
>  wrote:
> >
> > Hi Eduard,
> >
> > JB wrote
> >
> >> For the record (maybe it helps users/reviewers), this release includes:
> >> - ORC 1.9.4 update
> >> - introduce memory limit on ParallelIterable
> >
> >
> > I can confirm the ParallelIterable change, but I am not sure whether the ORC
> > update was part of the release.
> >
> >
> > Best
> > Piotr
> >
> >
> >
> >
> > On Wed, 21 Aug 2024 at 09:45, Fokko Driesprong  wrote:
> >>
> >> Hey Eduard,
> >>
> >> I think it relates to this PR. It contains a CVE fix and would be good to
> >> backport. We wanted to include it in 1.6.1 if we needed another RC,
> >> but that didn't happen, so I think we didn't cherry-pick it to the 1.6.x branch.
> >>
> >> Kind regards,
> >> Fokko
> >>
> >> On Wed, 21 Aug 2024 at 09:34, Eduard Tudenhöfner <etudenhoef...@apache.org> wrote:
> >>>
> >>> @Piotr can you please elaborate which ORC update you are referring to?
> >>> Or did you mean the Avro update (which I think we were planning for 1.6.2)?
> >>>
> >>> On Tue, Aug 20, 2024 at 7:05 PM Piotr Findeisen <piotr.findei...@gmail.com> wrote:
> 
>  Hi,
> 
>  -1 (non-binding)
> 
>  I verified that the source tarball matches the git tag (except that it lacks
>  jitpack.yml, docs/ and 'examples/Convert table to Iceberg.ipynb').
>  However, I noted that source tarball verification is not part of
>  https://iceberg.apache.org/how-to-release/#validating-a-source-release-candidate.
>  I started a separate dev list thread about this
>  (https://lists.apache.org/thread/24c0xhfbb2680nrqyd2jrngxtg6qoz8c).
> 
>  As to the changes, it looks like the RC contains the ParallelIterable
>  change, but I don't see the ORC update:
> 
>  $ git diff apache-iceberg-1.6.0..apache-iceberg-1.6.1-rc1  --numstat
>  167 55  core/src/main/java/org/apache/iceberg/util/ParallelIterable.java
>  48  0   core/src/test/java/org/apache/iceberg/util/TestParallelIterable.java
> 
>  I tested with Trino: https://github.com/trinodb/trino/pull/23083
>  The ParallelIterable change caused a regression in Trino when
>  planning queries with LIMIT.
>  Now the query scheduler opens more manifests than it used to
>  (test io.trino.plugin.iceberg.TestIcebergFileOperations#testSelectWithLimit
>  in Trino).
>  Reverting the change around the queue low water mark [1][2] fixed the
>  test for me locally.
> 
>  Best,
>  Piotr
> 
>  [1] https://github.com/apache/iceberg/pull/10978
>  [2] https://github.com/apache/iceberg/pull/10979
> 
> 
> 
>  On Tue, 20 Aug 2024 at 15:31, Jean-Baptiste Onofré wrote:
> >
> > +1 (non binding)
> >
> > I checked:
> > - download links are OK (both on dist and Maven Staging repo)
> > - build passed on the tag using JDK11, including the tests (I'm not
> > able to reproduce Renjie's issue)
> > - checksum and signature are good
> > - ASF header present in expected files
> > - no unexpected binary files found in the source distribution
> >
> > For the record (maybe it helps users/reviewers), this release
> > includes:
> > - ORC 1.9.4 update
> > - introduce memory limit on ParallelIterable
> >
> > Regards
> > JB
> >
> >
> > On Tue, Aug 20, 2024 at 4:53 AM Carl Steinbach wrote:
> > >
> > > Hi Everyone,
> > >
> > > I propose that we release the following RC as the official Apache
> > > Iceberg 1.6.1 release.
> > >
> > > The commit ID is e18a2fe10214f5f3ffa0a317a28af8b2a619817a
> > > * This corresponds to the tag: apache-iceberg-1.6.1-rc1
> > > * https://github.com/apache/iceberg/commits/apache-iceberg-1.6.1-rc1
> > > * https://github.com/apache/iceberg/tree/e18a2fe10214f5f3ffa0a317a28af8b2a619817a
> > >
> > > The release tarball, signature, and checksums are here:
> > > * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.6.1-rc1
> > >
> > > You can find the KEYS file here:
> > > * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
> > >
> > > Convenience binary artifacts are staged on Nexus. The Maven
> > > repository URL is:
> > > * https://repository.apache.org/content/repositories/orgapacheiceberg-1170/
> > >
> > > Please download, verify, and test.
> > >
> > > Please vote in the next 72 hours.
> > >
> > > [ ] +1 Release this as Apache Iceberg 1.6.1
> > > [ ] +0

Re: [DISCUSS] PyIceberg 0.8.1 release

2024-11-26 Thread Driesprong, Fokko
Thanks Kevin for cleaning that up, and generating the nice overview.

I have another candidate PR for 0.8.1:
https://github.com/apache/iceberg-python/pull/1383

Kind regards,
Fokko

On Mon, 25 Nov 2024 at 18:41, Kevin Liu wrote:

> Hey folks,
>
> I started working on the 0.8.1 release, using the updated "how to release"
> docs (
> https://github.com/apache/iceberg-python/blob/main/mkdocs/docs/how-to-release.md
> )
> Here are the 9 commits I propose to include in this next release:
> https://github.com/apache/iceberg-python/pull/1369
>
> Please let me know what you think.
>
> Best,
> Kevin Liu
>
>
> On Thu, Nov 21, 2024 at 10:05 AM Kevin Liu  wrote:
>
>> Thanks for starting this thread!
>>
>> Along with the 2 issues listed above, I propose this issue as well
>> * Ignore tables with missing table_type parameter in HMS and Glue (#1331)
>>
>> Best,
>> Kevin Liu
>>
>> On Thu, Nov 21, 2024 at 5:18 AM Jean-Baptiste Onofré 
>> wrote:
>>
>>> Hi Fokko
>>>
>>> It makes sense to me.
>>>
>>> Regards
>>> JB
>>>
>>> On Thu, Nov 21, 2024 at 9:14 AM Fokko Driesprong 
>>> wrote:
>>> >
>>> > Hi everyone,
>>> >
>>> > I suggest following up on the PyIceberg 0.8.0 release with a patch
>>> > release.
>>> >
>>> > Currently, we have two candidate bugfixes to be included:
>>> >
>>> > - An issue where a warning is falsely emitted when loading a table.
>>> > - An issue when adding a Parquet file to a table when the file doesn't
>>> >   have column statistics for at least one column.
>>> >
>>> > Feel free to chime in on this thread if you want to include bug fixes
>>> > in the 0.8.1 milestone.
>>> > Kind regards,
>>> > Fokko
>>>
>>


Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-12-02 Thread Driesprong, Fokko
Hey Bryan,

Thanks for running the release! +1 binding from my end.

Checked signature and checksum, ran license checks, and built against JDK17.

Kind regards,
Fokko

On Tue, 3 Dec 2024 at 03:16, Bryan Keller wrote:

> The 1.7.1 RC1 release is still up for a vote! The PR merged was to fix the
> release script to correct the KEYS link in the email template, for any
> potential future 1.7.x release.
>
> -Bryan
>
> On Dec 2, 2024, at 5:48 PM, Yufei Gu  wrote:
>
> Hi folks,
>
> Is the release blocked? I assume there would be an RC2, right?
> Yufei
>
>
> On Fri, Nov 22, 2024 at 9:02 AM Kevin Liu  wrote:
>
>> Thanks for adding that PR to the patch release! This should only affect
>> the release script, and only this once :).
>> I see that the documentation site has already been updated.
>> https://iceberg.apache.org/how-to-release/#setup
>>
>> Best,
>> Kevin Liu
>>
>> On Fri, Nov 22, 2024 at 6:36 AM Bryan Keller  wrote:
>>
>>> Apologies! I see Kevin updated this in the release script and docs
>>> already, in https://github.com/apache/iceberg/pull/11526. (The email
>>> template now points to https://downloads.apache.org/iceberg/KEYS.)
>>>
>>> Here's a PR to merge this to 1.7.x in case we have another patch
>>> release:
>>> https://github.com/apache/iceberg/pull/11629
>>>
>>> -Bryan
>>>
>>> On Nov 22, 2024, at 12:24 AM, Jean-Baptiste Onofré 
>>> wrote:
>>>
>>> Hi Yufei,
>>>
>>> As discussed on the dev mailing list (with Fokko), the KEYS file to
>>> use is: https://dist.apache.org/repos/dist/release/iceberg/KEYS
>>>
>>> Regards
>>> JB
>>>
>>> On Fri, Nov 22, 2024 at 6:36 AM Yufei Gu  wrote:
>>>
>>>
>>> Hi Bryan,
>>>
>>> This link seems broken,
>>> https://dist.apache.org/repos/dist/dev/iceberg/KEYS. Should we use
>>> another one, like the one in here
>>> https://downloads.apache.org/iceberg/KEYS?
>>>
>>> Yufei
>>>
>>>
>>> On Thu, Nov 21, 2024 at 2:36 PM Bryan Keller  wrote:
>>>
>>>
>>> Hi Everyone,
>>>
>>> I propose that we release the following RC as the official Apache
>>> Iceberg 1.7.1 release.
>>>
>>> The commit ID is 4a432839233f2343a9eae8255532f911f06358ef
>>> * This corresponds to the tag: apache-iceberg-1.7.1-rc1
>>> * https://github.com/apache/iceberg/commits/apache-iceberg-1.7.1-rc1
>>> *
>>> https://github.com/apache/iceberg/tree/4a432839233f2343a9eae8255532f911f06358ef
>>>
>>> The release tarball, signature, and checksums are here:
>>> *
>>> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.7.1-rc1
>>>
>>> You can find the KEYS file here:
>>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>>
>>> Convenience binary artifacts are staged on Nexus. The Maven repository
>>> URL is:
>>> *
>>> https://repository.apache.org/content/repositories/orgapacheiceberg-1178
>>>
>>> Please download, verify, and test.
>>>
>>> Please vote in the next 72 hours.
>>>
>>> [ ] +1 Release this as Apache Iceberg 1.7.1
>>> [ ] +0
>>> [ ] -1 Do not release this because...
>>>
>>> Only PMC members have binding votes, but other community members are
>>> encouraged to cast
>>> non-binding votes. This vote will pass if there are 3 binding +1 votes
>>> and more binding
>>> +1 votes than -1 votes.
>>>
>>> (NOTE: The vote on 1.7.1 RC0 was skipped as a last minute bug fix came
>>> in.)
>>>
>>>
>>>
>


[DISCUSS] Simplify multi-arg table metadata

2025-02-03 Thread Driesprong, Fokko
Hi everyone,

While I was looking to add the V3 partition-spec (de/en)coder to PyIceberg,
I noticed that it allows for backporting the multi-arg transforms to V1 and
V2 tables as well. I think it would be good to avoid doing this and keep it
simple by only allowing this with V3 tables. I've raised a PR. Please take a
look and let me know what you think.
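
To make the proposal concrete, here is a minimal sketch of the kind of check it implies: a partition field with more than one source column is only accepted for format version 3. The `PartitionField` class, its attribute names, and `validate_partition_field` are hypothetical and only illustrate the rule; they are not the PyIceberg API or the spec's JSON layout.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PartitionField:
    """Hypothetical, simplified partition field for illustration only."""
    name: str
    transform: str
    source_ids: Tuple[int, ...]


def validate_partition_field(field: PartitionField, format_version: int) -> None:
    """Reject multi-argument transforms on tables older than V3."""
    if not field.source_ids:
        raise ValueError(f"Partition field {field.name!r} has no source column")
    if len(field.source_ids) > 1 and format_version < 3:
        raise ValueError(
            f"Partition field {field.name!r} uses a multi-argument transform, "
            f"which is only allowed for V3 tables (got V{format_version})"
        )
```

A single-argument field passes for any version, while a field with two source ids raises for V1 and V2 and passes for V3.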

Kind regards,
Fokko


[VOTE] Simplify multi-arg table metadata

2025-02-09 Thread Driesprong, Fokko
Hey everyone,

After the positive responses on the devlist, I would
like to raise a vote to simplify the multi-argument transforms metadata,
and make it exclusive

A vote to simplify the


Re: [DISCUSS] PyIceberg 0.9.0 release

2025-02-21 Thread Driesprong, Fokko
Hey everyone,

Thanks everyone for being on top of 0.9.0. I've done some testing and found
a few more small improvements that would be great to get into 0.9.0.

Kind regards,
Fokko

On Thu, 20 Feb 2025 at 11:52, Fokko Driesprong wrote:

> Thanks Kevin for raising this!
>
>    - For the first issue, I think we should remove the deprecation
>      message and throw an error when the table reference is used.
>       - I don't think we should throw; I've left a comment on the PR.
>    - For the second
>       - We should clean this one up; I took the liberty of creating a PR.
>    - For deprecating botocore
>       - I'm leaning towards undoing this deprecation since there is no
>         good alternative without going into complex configuration. I also
>         created a PR for that.
>
> Thanks everyone, and LMKWYT!
>
> Kind regards,
> Fokko
>
>
>
> On Wed, 19 Feb 2025 at 23:29, Kevin Liu wrote:
>
>> Hey folks,
>>
>> While working on the 0.9.0 release candidate, we went through the process
>> of finding deprecations which should be removed in the upcoming release.
>>
>> For the 0.9.0 release, there are 3 deprecations marked for removal.
>> * Table name reference in scan expression
>> * REST catalog client AUTH_URL
>> * Deprecate botocore session
>>
>>
>> For the first issue, I think we should remove the deprecation message and
>> throw an error when the table reference is used.
>> For the second and third issue, I think we should push back the removal
>> to the next release since we don't have a good solution yet.
>>
>> Would love to hear what others think about this approach. Please let us
>> know if there are any concerns.
>>
>> Best,
>> Kevin Liu
>>
>> On Tue, Feb 18, 2025 at 7:11 AM Fokko Driesprong 
>> wrote:
>>
>>> Hey Kevin,
>>>
>>> Thanks for raising this. That sounds like a great idea to me, and thanks
>>> Drew for being the release manager for 0.9.0.
>>>
>>> Kind regards,
>>> Fokko
>>>
>>> On Mon, 17 Feb 2025 at 23:41, Kevin Liu wrote:
>>>
>>>> Thanks for volunteering! I'm happy to assist in any way I can. Let's
>>>> coordinate on Slack :)
>>>>
>>>> Quick follow-up on the commit hash pinned above: it's meant as a
>>>> reference point and not the absolute cutoff. In fact, we merged a few PRs
>>>> today that will be included in 0.9.0. Please chime in here if there are
>>>> other PRs that you think should be part of this upcoming release.
>>>>
>>>> Best,
>>>> Kevin Liu
>>>>
>>>> On Mon, Feb 17, 2025 at 9:35 AM Drew wrote:
>>>>
>>>>> Hey Kevin,
>>>>>
>>>>> Thanks for kicking this off. It’s exciting to see how much PyIceberg
>>>>> has been evolving! I’d be happy to take on the Release Manager role for
>>>>> this release! I think it would be a good opportunity to try out the new
>>>>> release process documentation.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Drew
>>>>> On Mon, Feb 17, 2025 at 8:55 AM Kevin Liu
>>>>> wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> It's been a while since we released a new version of PyIceberg! The
>>>>>> last minor release (0.8.0) was on November 18, and the most recent patch
>>>>>> release (0.8.1) was on December 6. Time flies! There have been >200
>>>>>> commits since 0.8.0 and A LOT of great features contributed by the
>>>>>> community.
>>>>>>
>>>>>> I propose we cut a new release for version 0.9.0 based on the current
>>>>>> HEAD of the main branch, commit 300b840.
>>>>>>
>>>>>> I'm happy to serve as the Release Manager again. But since I recently
>>>>>> rewrote the "How to Release" documentation, it would be great if
>>>>>> someone else could follow the updated instructions to ensure they are
>>>>>> clear.
>>>>>>
>>>>>> I've started a thread on Slack to
>>>>>> gather feedback from the

Re: [VOTE] Release Apache Iceberg 1.8.1 RC1

2025-02-25 Thread Driesprong, Fokko
+1 (binding)

   - Checked signatures and checksum
   - Checked licenses
   - Spot-checked NOTICE/LICENSE

Kind regards,
Fokko
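
The checksum step in the list above can be sketched in Python. This is only a minimal illustration, not the project's official verification script: the helper names `sha512_of` and `checksum_matches` are hypothetical, and it assumes the `.sha512` file stores the hex digest as its first whitespace-separated token (the layout produced by `shasum -a 512`). Signature verification with GPG is a separate step and is not covered here.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact: Path, checksum_file: Path) -> bool:
    """Compare the artifact's digest with the first token of the checksum file.

    Assumption: the checksum file looks like "<hex digest>  <file name>".
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

A caller would pass the downloaded tarball and its accompanying `.sha512` file and treat a `False` result as a reason to vote -1.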

On Tue, Feb 25, 2025 at 16:56, Kevin Liu wrote:

> +1 (non-binding)
>
> I followed "How to Verify a Release".
> Checked out the artifact from SVN:
> ```
> svn checkout
> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.8.1-rc1/ .
> ```
>
> Verified
> * Signature Good
> * Checksum Ok
> * RAT check passed. 1 unrelated error message
> ```
> ERROR: Ignored 0 lines in your exclusion files as comments or empty lines.
> ```
> * Build + test passed, running on Java 17.0.6 (openjdk 17.0.6 2023-01-17
> LTS) on M1
> * Ran a few examples on Spark
> * Ran pyiceberg integration tests,
> https://github.com/kevinjqliu/iceberg-python/pull/11
>
> I ran the tests both with and without the docker daemon. Without docker, a
> few tests failed in `iceberg-aws`, `iceberg-azure`, and
> `iceberg-kafka-connect`. There's already an issue to track this at
> https://github.com/apache/iceberg/issues/12236.
> I'm also continuing to see the flaky test for `iceberg-core`'s
> `testConcurrentFastAppends` test. I believe this is a local issue with my
> machine.
>
> Thanks for running the release, Eduard!
>
> Best,
> Kevin Liu
>
> On Tue, Feb 25, 2025 at 4:23 AM Jean-Baptiste Onofré 
> wrote:
>
>> +1 (non-binding)
>>
>> - Hash and checksum are good
>> - LICENSE and NOTICE are OK in different distributed artifacts (source
>> distribution, aws bundle, etc)
>> - ASF header present in all expected files
>> - No binary file found in the source distribution
>> - Did quick smoke tests
>>
>> Thanks,
>> Regards
>> JB
>>
>> On Mon, Feb 24, 2025 at 1:46 PM Eduard Tudenhoefner
>>  wrote:
>> >
>> > Hi Everyone,
>> >
>> > I propose that we release the following RC as the official Apache
>> > Iceberg 1.8.1 release.
>> >
>> > The commit ID is 9ce0fcf0af7becf25ad9fc996c3bad2afdcfd33d
>> > * This corresponds to the tag: apache-iceberg-1.8.1-rc1
>> > * https://github.com/apache/iceberg/commits/apache-iceberg-1.8.1-rc1
>> > * https://github.com/apache/iceberg/tree/9ce0fcf0af7becf25ad9fc996c3bad2afdcfd33d
>> >
>> > The release tarball, signature, and checksums are here:
>> > * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.8.1-rc1
>> >
>> > You can find the KEYS file here:
>> > * https://downloads.apache.org/iceberg/KEYS
>> >
>> > Convenience binary artifacts are staged on Nexus. The Maven repository
>> > URL is:
>> > * https://repository.apache.org/content/repositories/orgapacheiceberg-1184/
>> >
>> > Please download, verify, and test.
>> >
>> > Please vote in the next 72 hours.
>> >
>> > [ ] +1 Release this as Apache Iceberg 1.8.1
>> > [ ] +0
>> > [ ] -1 Do not release this because...
>> >
>> > Only PMC members have binding votes, but other community members are
>> > encouraged to cast non-binding votes. This vote will pass if there are
>> > 3 binding +1 votes and more binding +1 votes than -1 votes.
>> >
>>
>
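
The passing rule quoted in the vote email above (at least 3 binding +1 votes, and more binding +1 votes than binding -1 votes) can be sketched as a small tally function. This is an illustrative sketch only: the `Vote` type, the `vote_passes` helper, and the voter names in the usage note are hypothetical and not part of any Apache tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str     # hypothetical identifier for the person voting
    value: int     # +1, 0, or -1
    binding: bool  # True when the voter is a PMC member


def vote_passes(votes: list[Vote], required_binding: int = 3) -> bool:
    """Apply the rule from the email: only binding votes count toward the result."""
    plus = sum(1 for v in votes if v.binding and v.value > 0)
    minus = sum(1 for v in votes if v.binding and v.value < 0)
    return plus >= required_binding and plus > minus
```

For example, `vote_passes([Vote("a", 1, True), Vote("b", 1, True), Vote("c", 1, False)])` is `False`, because only two of the +1 votes are binding.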