Re: Seeking Input on Handling Ambiguity in Generating Changelogs

2023-04-24 Thread Yufei Gu
> Two rows are the “same”, that is, the rows represent the same entity, if the identifier fields are equal. However, uniqueness of rows by this identifier is not guaranteed or required by Iceberg and it is the responsibility of processing engines or data providers to enforce. Based on the ab

Re: What is the harm of adding partition to iceberg table?

2023-04-24 Thread Fokko Driesprong
Hi ZC C, Adding partitions to Iceberg tables is easy, and changing them later on is easy as well. The existing data will continue to exist with the partitioning that it was initially written with; new data will be written according to the active partitioning. When you rewrite the data (for example

What is the harm of adding partition to iceberg table?

2023-04-24 Thread ZC C
We are now creating a row data table, and my colleague wants to add org_id as the partition. What is the harm of adding a partition to an Iceberg table?

Re: [DISCUSS] Switch to JDK 11 for releases?

2023-04-24 Thread Jack Ye
I agree, it wouldn't help given the fact that we won't be able to use the features in newer JDKs. However, I think there is still a difference in the artifact compiled by JDK8, vs compiled by JDK11 with --release=8, that might be useful. For example, I came across this try with resource introduces

Re: [DISCUSS] Spark 3.1 support?

2023-04-24 Thread Edgar Rodriguez
Hi all, Thanks for the discussion. Similarly to Manu, we're on Spark 3.1.1 and Iceberg 1.1.0 - we backport Spark 3.1.1 fixes internally as well. It's a bit more complicated to move fast on Spark versions internally, mainly due to the number of Scala customers that we have. I understand maintainin

Reading/Streaming Iceberg tables using PyFlink's DataStream API

2023-04-24 Thread Agrawal, Sanket
Hi, We want to read an Iceberg table using PyFlink's DataStream API. This will be in streaming mode. We know it is possible using Flink's Java API as mentioned in the link

Re: [DISCUSS] Switch to JDK 11 for releases?

2023-04-24 Thread Zoltán Borók-Nagy
Besides Hive, Impala is not compatible with Java 11 right now either. This work is in progress: https://issues.apache.org/jira/browse/IMPALA-11360 - Zoltan On Mon, Apr 24, 2023 at 11:07 AM Mass Dosage wrote: > I agree with Ryan, unless you can change the source version there's not that much point

Re: [DISCUSS] Switch to JDK 11 for releases?

2023-04-24 Thread Mass Dosage
I agree with Ryan, unless you can change the source version there's not that much point. On the Hive front, as you can see from that ticket it's been open for 4(!) years and hasn't received much action recently. I think it's one of the reasons AWS EMR still defaults to Java 8. It would be really g