Re: Welcome new PMC members!

2023-04-11 Thread OpenInx
Congrats! On Wed, Apr 12, 2023 at 10:25 AM Junjie Chen wrote: > Congratulations to all of you! > > On Wed, Apr 12, 2023 at 10:07 AM Reo Lei wrote: > >> Congratulations!!! >> >> On Wed, Apr 12, 2023 at 09:19, yuxia wrote: >> >>> Congratulations to all! >>> >>> Best regards, >>> Yuxia >>> >>>

Re: RFC: Control flink upsert sink’s memory usage of insertedRowMap

2023-12-10 Thread OpenInx
Just to provide a little context: there was a stale PR that tried to maintain the insertedRowMap in RocksDB. On Sat, Dec 9, 2023 at 1:52 AM Ryan Blue wrote: > Thanks, Renjie! > > The option to use Flink's state tracking system seems like a good idea to > me. > > On Thu, Dec 7, 2023 at 8:1

Re: RFC: Control flink upsert sink’s memory usage of insertedRowMap

2023-12-10 Thread OpenInx
https://github.com/apache/iceberg/pull/2680/files On Mon, Dec 11, 2023 at 11:15 AM OpenInx wrote: > Just provided a little context: there was a stale PR which was trying to > maintain the insertedRowMap into RocksDB.. > > On Sat, Dec 9, 2023 at 1:52 AM Ryan Blue wrote: > >

Spark cannot read iceberg tables which were originally written by Impala

2023-12-25 Thread OpenInx
Hi dev Sensordata [1] encountered an interesting Apache Impala & Iceberg bug in a real customer production environment. Their customers use Apache Impala to create a large number of Apache Hive tables in HMS, and ingested PB-level datasets into their Hive tables (which were originally written b

Re: Spark cannot read iceberg tables which were originally written by Impala

2024-01-03 Thread OpenInx
imit the STRING >> type to actual string data. This approach does not fix already written >> files, as you already pointed out. >> >> Approach C: Migration job could copy data files but rewrite file >> metadata, if needed. This makes migration slower, but it's probably

Re: Spark cannot read iceberg tables which were originally written by Impala

2024-01-03 Thread OpenInx
set PARQUET_ANNOTATE_STRINGS_UTF8 > for themselves. > > Approach C: Yeah, if Approach A goes through then we don't really need to > bother with this. > > Cheers, > Zoltan > > > On Wed, Jan 3, 2024 at 2:02 PM OpenInx wrote: > >> Thanks Zoltan and

Re: [ANNOUNCE] New committer: Honah J.

2024-01-14 Thread OpenInx
Congrats, Honah! On Sun, Jan 14, 2024 at 1:25 AM Jun H. wrote: > Congratulations! > > On Jan 12, 2024, at 10:12 PM, Péter Váry > wrote: > > > Congratulations! > > On Sat, Jan 13, 2024, 06:26 Jean-Baptiste Onofré wrote: > >> Congrats ! >> >> Regards >> JB >> >> On Fri, Jan 12, 2024 at 22:11,

Re: Iceberg tombstone?

2020-01-14 Thread OpenInx
Hi Filip So your team has implemented the tombstone feature in your internal branch. As I understand it, the tombstone you mean is similar to the delete marker in HBase, so I think you're trying to implement the update/delete feature. For this part, Anton and Miguel have a design doc [

Re: TableMetadata.buildReplacement() adds old spec list

2020-02-05 Thread OpenInx
Hi I guess we shouldn't make the change, because the immutable partition spec list seems to be intended to maintain all the historical specs, not only the latest version. If we change it here, it would break the specs & specsById definition in TableMetadata. > This change is breaking TestReplaceTransaction.testRe

Re: File leaking in RemoveSnapshots!

2020-02-26 Thread OpenInx
Hi Ashish It's indeed a bug as far as I understand. I read your idea about the transaction hook. Removing the data & manifest files of expired snapshots should happen after writing the version-hint file (otherwise some readers will access snapshots whose data files we are deleting). so the h

Re: upsert base on copy on write mode

2020-03-03 Thread OpenInx
I think we should abstract the API first, then implement the MOR. COW is also a necessary implementation, but it's easy to implement and not so urgent. On Tue, Mar 3, 2020 at 3:45 PM Junjie Chen wrote: > Thanks, Ryan > > Maybe the discussion is very clear before. Actually, we have built an > in

Re: Has the topic of CDC (change data capture) been considered for Iceberg? If not, should it?

2020-03-12 Thread OpenInx
Hi Filip We (Alibaba & Tencent) are doing the Apache Iceberg row-level update/delete PoC. Syncing the change log (such as row-level binlog) into an Iceberg data lake is the classic case we are trying to implement (another classic case would be one streaming job or batch job with one or more update

Re: Iceberg articles for you

2020-03-12 Thread OpenInx
Great work, Junjie. Maybe we could add a tab named [blog] under https://iceberg.apache.org/ and put the English versions of those posts there, so that people worldwide could read them (and search engines could index them). Thanks. On Fri, Mar 13, 2020 at 9:55 AM Junjie Chen wrote: > Hi devs

Re: File leaking in RemoveSnapshots!

2020-03-17 Thread OpenInx
data to leak. > For example, an executor failure in Spark can lead to a data file getting > written but not committed to a table. I recommend having a way to > periodically list table locations and check against the metadata tables, > like all_data_files and all_manifests, to en

Re: Shall we start a regular community sync up?

2020-03-18 Thread OpenInx
+1 On Wed, Mar 18, 2020 at 10:30 PM Saisai Shao wrote: > Hi team, > > With more companies and developers joining in the community, I was > wondering if we could have regular sync up to discuss anything about > Iceberg, like milestone, feature design, etc. I think this will be quite > helpful to

What have I learned from doing Merge-On-Read PoC

2020-03-21 Thread OpenInx
https://github.com/openinx/incubator-iceberg/pull/5/files [3]. https://docs.google.com/document/d/1CPFun2uG-eXdJggqKcPsTdNa2wPMpAdw8loeP-0fm_M/edit?usp=sharing

Re: What have I learned from doing Merge-On-Read PoC

2020-03-23 Thread OpenInx
/apache/incubator-iceberg/compare/master...chenjunjiedada:row-level-delete#diff-c168df8c9739650eab655b22b0b549acR407 [4]. https://github.com/apache/incubator-iceberg/compare/master...chenjunjiedada:row-level-delete#diff-fffa37e29d3736de086cbd23094865b7R63 On Sun, Mar 22, 2020 at 8:49 PM Junjie Chen

Re: Shall we start a regular community sync up?

2020-03-23 Thread OpenInx
Hi Ryan I received your invitation. Some people from our Flink team also want to join the Hangouts meeting. Do we also need to send them an extra invitation? Or could they just join the meeting by entering the meeting address [1]? If an invitation is needed, please let the following people in: 1. ykt...@gmail.

Open a new branch for row-delete feature ?

2020-03-27 Thread OpenInx
Dear Dev: On Tuesday we had a sync meeting and discussed the following things: 1. cut the 0.8.0 release; 2. Flink connector; 3. Iceberg row-level delete; 4. Map-Reduce formats and Hive support. We'll release version 0.8.0 around April 15, the followin

Re: Iceberg community sync - 2020-03-25

2020-03-28 Thread OpenInx
er if he has the bandwidth. There're some flink committers and PMC in our flink team, we could also ping them. > Openinx brought up concerns about minimizing end-to-end latency Agreed that we could implement the file/pos deletes and equality-deletes firstly. The off-line optimization seems

Re: Open a new branch for row-delete feature ?

2020-03-30 Thread OpenInx
needs to be done -- like creating readers and writers for >> diff formats -- can be done in master. >> >> rb >> >> On Mon, Mar 30, 2020 at 9:00 AM Gautam wrote: >> >>> Thanks for bringing this up OpenInx. That's a great idea: to open a >>>

Re: Open a new branch for row-delete feature ?

2020-03-31 Thread OpenInx
Miao > > > > *From: *Ryan Blue > *Reply-To: *"dev@iceberg.apache.org" , " > rb...@netflix.com" > *Date: *Tuesday, March 31, 2020 at 10:08 AM > *To: *OpenInx > *Cc: *Iceberg Dev List > *Subject: *Re: Open a new branch for row-delete feature ? > > >

Re: Open a new branch for row-delete feature ?

2020-04-01 Thread OpenInx
tstanding. Maybe we will choose to add a type column >like this, but I’d like to have a design in mind before we merge these PRs. >Thinking through this and coming up with a proposal here is the next >priority for this work, because it will unlock more tasks we can do in >

Re: Iceberg community sync notes - 15 April 2020

2020-04-16 Thread OpenInx
Thanks for the write-up. Views from the Netflix branch are a great feature. Is there any plan to port them to Apache Iceberg? On Fri, Apr 17, 2020 at 5:31 AM Ryan Blue wrote: > Here are my notes from yesterday’s sync. As usual, feel free to add to > this if I missed something. > > There were a coupl

Re: [VOTE] Release Apache Iceberg 0.8.0-incubating RC2

2020-04-30 Thread OpenInx
I checked RC2. It seems TestHiveTableConcurrency is broken, and we may need to fix it. 1. Download the tarball and check the signature & checksum: OK 2. License checking: RAT checks passed. 3. Build and test the project (java8): org.apache.iceberg.hive.TestHiveTableConcurrency > testConcurrentConnec

Re: [DISCUSS] Changes for row-level deletes

2020-05-05 Thread OpenInx
The two-phase approach sounds good to me. The precondition is that we have a limited number of delete files so that memory can hold all of them. We will have the compaction service to reduce the delete files, so it does not seem to be a problem.

Re: [DISCUSS] Changes for row-level deletes

2020-05-05 Thread OpenInx
t so we plan to do the PoC. [1]. https://github.com/generic-datalake/iceberg [2]. https://github.com/generic-datalake/iceberg/tree/master/flink/src On Wed, May 6, 2020 at 11:44 AM OpenInx wrote: > The two-phase approach sounds good to me. The precondition is that we have a > limited number of delete

Re: [VOTE] Graduate to a top-level project

2020-05-12 Thread OpenInx
+1 for graduation. It's great news that we're ready to graduate (non-binding). On Wed, May 13, 2020 at 9:50 AM Saisai Shao wrote: > +1 for graduation. > > On Wed, May 13, 2020 at 9:33 AM, Junjie Chen wrote: > >> +1 >> >> On Wed, May 13, 2020 at 8:07 AM RD wrote: >> >>> +1 for graduation! >>> >>> On Tu

[Doc] Streaming CDC in Iceberg

2020-06-28 Thread OpenInx
Hi dev: We have a discussion about the equality-deletes here [1]. It seems more complex when considering the CDC events streaming to the iceberg table, so I prepared a document for further discussion here [2]. Any suggestions and feedback are welcome, thanks. [1]. https://github.com/apache/iceb

Re: Iceberg V2 Spec

2020-07-01 Thread OpenInx
Hi Ryan: Just curious when do we plan to release 0.9.0 ? I expect that the flink connector could be included in release 0.9.0. Thanks. On Thu, Jul 2, 2020 at 12:14 AM Ryan Blue wrote: > Hi Chen, > > Right now, the main parts of the v2 spec are the addition of sequence > numbers and delete fil

Re: Iceberg V2 Spec

2020-07-02 Thread OpenInx
y, just like we would do for Spark 3 support. > > Does that sound reasonable? > > On Wed, Jul 1, 2020 at 7:39 PM OpenInx wrote: > >> Hi Ryan: >> >> Just curious when do we plan to release 0.9.0 ? I expect that the flink >> connector could be included in rel

Re: [VOTE] Release Apache Iceberg 0.9.0 RC5

2020-07-09 Thread OpenInx
I followed the verify guide here ( https://lists.apache.org/thread.html/rd5e6b1656ac80252a9a7d473b36b6227da91d07d86d4ba4bee10df66%40%3Cdev.iceberg.apache.org%3E) : 1. Verify the signature: OK 2. Verify the checksum: OK 3. Untar the archive tarball: OK 4. Run RAT checks to validate license headers:

Re: New committer: Shardul Mahadik

2020-07-22 Thread OpenInx
Congratulations ! On Thu, Jul 23, 2020 at 9:31 AM Jingsong Li wrote: > Congratulations Shardul! Well deserved! > > Best, > Jingsong > > On Thu, Jul 23, 2020 at 7:27 AM Anton Okolnychyi > wrote: > >> Congrats and welcome! Keep up the good work! >> >> - Anton >> >> On 22 Jul 2020, at 16:02, RD w

Re: Iceberg community sync notes - 29 July 2020

2020-08-02 Thread OpenInx
. Once we've finished the flink DataStream iceberg sink, we will create PRs to make the flink table sql work. 3. Flink streaming reader / batch reader etc. > Kyle: I’ll be interested to review. Thanks for your time to review those PR. > It seems like points raised by @openinx in th

Re: [DISCUSS] 0.9.1 release

2020-08-03 Thread OpenInx
> Does anyone know if we can recover existing data affected by it? In PR #1271, there are two data types which have correctness bugs: decimal18 and timestampZone. For decimal18, we actually write the correct decimal value, but read it in an incorrect way. Say, for decimal(10,3) and value =

Re: [DISCUSS] August board report

2020-08-12 Thread OpenInx
> Community members gave 2 Iceberg talks at Subsurface Conf, on enabling Hive queries against Iceberg tables and working with petabyte-scale Iceberg tables. Iceberg was also mentioned in the keynotes. Are there slides or videos about the two iceberg talks ? I'd like to read/watch slides or videos

Re: [DISCUSS] August board report

2020-08-13 Thread OpenInx
ecordings > -- > Jacques Nadeau > CTO and Co-Founder, Dremio > > > On Wed, Aug 12, 2020 at 7:07 PM OpenInx wrote: > >> > Community members gave 2 Iceberg talks at Subsurface Conf, on enabling >> Hive >> queries against Iceberg tables and working with petabyte-sca

Re: Timestamp Based Incremental Reading in Iceberg ...

2020-09-08 Thread OpenInx
I agree that it's helpful to allow users to read the incremental delta based on a timestamp; as Jingsong said, a timestamp is more user-friendly. My question is how to implement this. If we just attach the client's timestamp to the Iceberg table when committing, then different clients may have different tim

Re: Iceberg V2 Spec

2020-09-20 Thread OpenInx
> in >> a few update emails related, but it only covers one part. >> >> Chen >> >> On Thu, Jul 2, 2020 at 9:53 PM OpenInx wrote: >> >>> Sounds good to me. >>> >>> Thanks. >>> >>> On Fri, Jul 3, 2020 at 12:58 AM Rya

Re: Incremental reads for Upsert!

2020-10-19 Thread OpenInx
m. Right now, > all of the readers produce records from the current tables state. I think > @OpenInx and @Jingsong Li have > some plans to expose such a reader for Flink, though. Maybe they can work > with you to on some milestones and a roadmap. > > rb > > On Fri,

Several flink pull requests need to get merged before the next release 0.10.0

2020-10-19 Thread OpenInx
Hi As we know, the next release, 0.10.0, is coming. In my mind there are several issues which should be merged as soon as possible: 1. https://github.com/apache/iceberg/pull/1477 It will change the Flink state design to maintain the complete data files in a manifest before the checkpoint finishes,

Re: Several flink pull requests need to get merged before the next release 0.10.0

2020-10-27 Thread OpenInx
rb > > On Mon, Oct 19, 2020 at 7:15 PM OpenInx wrote: > >> Hi >> >> As we know that we next release 0.10.0 is coming, there are several >> issues which should be merged as soon as possible in my mind: >> >> 1. https://github.com/apache/iceberg/pull/1

Plans for the future iceberg 0.11.0 release

2020-10-28 Thread OpenInx
Hi dev As we know, we will be happy to cut the iceberg 0.10.0 candidate release this week. I think it may be the time to plan for the future iceberg 0.11.0 now, so I created a Java 0.11.0 Release milestone here [1] I put the following issues into the newly created milestone: 1. Apache Flink

Re: Plans for the future iceberg 0.11.0 release

2020-11-01 Thread OpenInx
Thanks for the context about FLIP-27, Steven! I will take a look at the patches under issue 1626. On Sat, Oct 31, 2020 at 2:03 AM Steven Wu wrote: > OpenInx, thanks a lot for kicking off the discussion. Looks like my > previous reply didn't reach the mailing list. > > >

Re: [VOTE] Release Apache Iceberg 0.10.0 RC2

2020-11-01 Thread OpenInx
+1 for 0.10.0 RC2 1. Download the source tarball, signature (.asc), and checksum (.sha512): OK 2. Import gpg keys: download KEYS and run gpg --import /path/to/downloaded/KEYS (optional if this hasn’t changed) : OK 3. Verify the signature by running: gpg --verify apache-iceberg-xx-incubating.tar.

Re: Plans for the future iceberg 0.11.0 release

2020-11-02 Thread OpenInx
les using Hive DDL without needing to pass a > JSON-serialized schema that would be good to get in, and I think it would > be good to get the basic write path committed as well. > > On Sun, Nov 1, 2020 at 5:57 PM OpenInx wrote: > >> Thanks for your context about FLIP-27, Steven

Re: [VOTE] Release Apache Iceberg 0.10.0 RC2

2020-11-03 Thread OpenInx
>> On Mon, Nov 2, 2020 at 2:28 PM Mass Dosage >>>> wrote: >>>> >>>>> +1 (non-binding) >>>>> >>>>> I ran the RC against a set of integration tests I have for a subset of >>>>> the Hive2 read functionali

Re: [VOTE] Release Apache Iceberg 0.10.0 RC2

2020-11-03 Thread OpenInx
r the attention. On Wed, Nov 4, 2020 at 1:31 AM Ryan Blue wrote: > OpenInx, is that a general question or is it related to the release? It > doesn't look related, but I want to make sure. > > On Tue, Nov 3, 2020 at 5:41 AM OpenInx wrote: > >> Hi >> >>

Re: [VOTE] Release Apache Iceberg 0.10.0 RC4

2020-11-03 Thread OpenInx
+1 for 0.10.0 RC4 1. Download the source tarball, signature (.asc), and checksum (.sha512): OK 2. Import gpg keys: download KEYS and run gpg --import /path/to/downloaded/KEYS (optional if this hasn’t changed) : OK 3. Verify the signature by running: gpg --verify apache-iceberg-xx.tar.gz.asc: OK

What's the time to expose iceberg format v2 to end users ?

2020-12-16 Thread OpenInx
Hi I am writing this email to align with the community on when to expose format v2 to end users. In Iceberg format v2, we've accomplished row-level delete. It's designed for two use cases: 1. Execute a single query to update or delete lots of rows. It's a typical batch update/delete j

Re: What's the time to expose iceberg format v2 to end users ?

2020-12-18 Thread OpenInx
Thanks Yan for the document, I will take a look at it, and see what I can do. On Fri, Dec 18, 2020 at 3:38 AM Yan Yan wrote: > Hi OpenInx, > > Thanks for bringing this up. I am currently working on Format v2 blocking > tasks, and am maintaining a full list of blocking task

Re: how to test row level delete

2020-12-27 Thread OpenInx
Hi liubo07199 Thanks for testing the Iceberg row-level delete. I skimmed the code, and it seems you were trying the equality-delete feature. I think Iceberg users shouldn't have to write that kind of Iceberg-internal code to get this to work; it isn't user-friendly. Instead, we usually use the e

Re: how to test row level delete

2020-12-27 Thread OpenInx
> you can apply this patch in your own repository The patch is : https://github.com/apache/iceberg/pull/1978 On Mon, Dec 28, 2020 at 10:32 AM OpenInx wrote: > Hi liubo07199 > > Thanks for testing the iceberg row-level delete, I skimmed the code, it > seems you were trying the

Re: how to generate a new .v1.metadata.json.crc for v1.metadata.json

2020-12-27 Thread OpenInx
You edited the v1.metadata.json to support Iceberg format v2? That's not the correct way to use Iceberg format v2. Let's discuss this issue in the latest email. On Sat, Dec 26, 2020 at 7:01 PM 1 wrote: > Hi, all: > >I vim the v1.metadata.json, so old .v1.metadata.json.crc is not >

Re: test flakiness with SocketException of broken pipe in HiveMetaStoreClient

2021-01-06 Thread OpenInx
I encountered a similar issue when supporting hive-site.xml for flink hive catalog. Here is the discussion and solution before: https://github.com/apache/iceberg/pull/1586#discussion_r509453461 It's a connection leak issue. On Thu, Jan 7, 2021 at 10:06 AM Ryan Blue wrote: > I've noticed this

Re: test flakiness with SocketException of broken pipe in HiveMetaStoreClient

2021-01-07 Thread OpenInx
ithub.com/apache/iceberg/pull/2051/files On Fri, Jan 8, 2021 at 5:48 AM Steven Wu wrote: > Ryan/OpenInx, thanks a lot for the pointers. > > I was able to almost 100% reproduce the HiveMetaStoreClient aborted > connection problem locally with Flink tests after adding > another Dele

Re: test flakiness with SocketException of broken pipe in HiveMetaStoreClient

2021-01-08 Thread OpenInx
d you have any extra usage about the table loader and forget to close it in your flip-27 dev branch ? [1]. https://github.com/apache/iceberg/blob/7645ceba65044184be192a7194a38729133b2e50/flink/src/main/java/org/apache/iceberg/flink/source/FlinkInputFormat.java#L77 On Fri, Jan 8, 2021 at 3:36 PM Open

Re: Welcoming Peter Vary as a new committer!

2021-01-25 Thread OpenInx
Congratulations and welcome Peter ! On Tue, Jan 26, 2021 at 9:41 AM Junjie Chen wrote: > Congratulations! > > On Tue, Jan 26, 2021 at 8:26 AM Jun H. wrote: > >> Congratulations >> >> On Mon, Jan 25, 2021 at 4:18 PM Yan Yan wrote: >> > >> > Congratulations! >> > >> > On Mon, Jan 25, 2021 at 3:0

Re: [VOTE] Release Apache Iceberg 0.11.0 RC0

2021-01-25 Thread OpenInx
Hi dev I'd like to include this patch in release 0.11.0 because it's the document of new flink features. I'm sorry that I did not update the flink's document in time when the feature code merged, but I think it's worth it to merge this document PR when we release iceberg 0.11.0, that helps a lot

Re: Sync to discuss secondary index proposal

2021-01-28 Thread OpenInx
+1, my time zone is CST. On Fri, Jan 29, 2021 at 6:57 AM Xinli shang wrote: > I had some earlier discussion with Miao on this. I am still interested in > it. My time zone is PST. > > On Thu, Jan 28, 2021 at 2:50 PM Jack Ye wrote: > >> +1, looking forward to the discussion, please include me an

Re: Sync to discuss secondary index proposal

2021-01-28 Thread OpenInx
it?ts=601316b0# On Fri, Jan 29, 2021 at 10:16 AM 李响 wrote: > +1, my colleagues and I is at UTC+8 > > On Fri, Jan 29, 2021 at 9:50 AM OpenInx wrote: > >> +1, my time zone is CST. >> >> On Fri, Jan 29, 2021 at 6:57 AM Xinli shang >> wrote: >> >>> I h

Re: Sync to discuss secondary index proposal

2021-01-28 Thread OpenInx
Sorry I sent the wrong link, the secondary index document link is: https://docs.google.com/document/d/1E1ofBQoKRnX04bWT3utgyHQGaHZoelgXosk_UNsTUuQ/edit On Fri, Jan 29, 2021 at 10:31 AM OpenInx wrote: > Hi > > @Miao WangWould you mind to share your current PoC > code or

Re: Secondary Indexes - Pluggable File Filter interface for Apache Iceberg

2021-03-03 Thread OpenInx
It will be 1:00 AM (China Standard Time) on 18 March, and it works for our Asia people. I'd love to attend this discussion, Thanks. On Thu, Mar 4, 2021 at 9:50 AM Ryan Blue wrote: > Thanks for putting this together, Guy! I just did a pass over the doc and > it looks like a really reasonable

Sync: the progress of row-level delete

2021-03-14 Thread OpenInx
Hi iceberg dev: Currently, Junjie Chen and I have made some progress about the Rewrite Action for format v2. We will have two kinds of Rewrite Action: 1. The first one is rewriting equality delete rows into position delete rows. The PoC PR is here: https://github.com/apache/iceberg/pull/221

Re: When is the next release of Iceberg ?

2021-03-23 Thread OpenInx
Hi Himanshu Thanks for the email. Currently Flink + Iceberg supports writing CDC events into an Apache Iceberg table through the Flink DataStream API, and Spark/Presto/Hive can read those events in batch jobs. But there are still some issues that we have not finished yet: 1. Expose the iceberg v2 to

Re: Welcoming Yan Yan as a new committer!

2021-03-23 Thread OpenInx
Congrats Yan ! You deserve it. On Wed, Mar 24, 2021 at 7:18 AM Miao Wang wrote: > Congrats @Yan Yan ! > > > > Miao > > > > *From: *Ryan Blue > *Reply-To: *"dev@iceberg.apache.org" > *Date: *Tuesday, March 23, 2021 at 3:43 PM > *To: *Iceberg Dev List > *Subject: *Welcoming Yan Yan as a new c

Re: Welcoming Russell Spitzer as a new committer

2021-03-29 Thread OpenInx
Congrats, Russell ! Well-deserved ! On Tue, Mar 30, 2021 at 9:33 AM Junjie Chen wrote: > Congratulations, Russell! Nice work! > > On Tue, Mar 30, 2021 at 5:02 AM Daniel Weeks > wrote: > >> Congrats, Russell! >> >> On Mon, Mar 29, 2021 at 1:59 PM Ryan Blue >> wrote: >> >>> Congratulations, R

Re: Welcoming Ryan Murray as a new committer!

2021-03-29 Thread OpenInx
Congrats, Ryan ! Well-deserved ! On Tue, Mar 30, 2021 at 9:32 AM Junjie Chen wrote: > Congratulations. Ryan! > > On Tue, Mar 30, 2021 at 5:02 AM Daniel Weeks > wrote: > >> Congrats, Ryan and thanks for all the great work! >> >> On Mon, Mar 29, 2021 at 1:59 PM Ryan Blue >> wrote: >> >>> Congra

Re: When is the next release of Iceberg ?

2021-04-02 Thread OpenInx
Hi Himanshu If you want to try Flink + Iceberg for syncing MySQL binlog to an Iceberg table, you might be interested in these PRs: 1. https://github.com/apache/iceberg/pull/2410 2. https://github.com/apache/iceberg/pull/2303 On Wed, Mar 24, 2021 at 10:34 AM OpenInx wrote: > Hi Himan

Re: how to test row level delete

2021-04-06 Thread OpenInx
w referenced in your > previous email. > > TableProperties.FORMAT_VERSION > > Can you suggest? I want to create a V2 table to test some row level > upserts/deletes. > > Chen > > On Sun, Dec 27, 2020 at 9:33 PM OpenInx wrote: > >> > you can apply this p

Re: Compaction Sync - Monday

2021-04-18 Thread OpenInx
Thanks for pinging me. I'd like to attend that meeting, but the time is April 20 at 00:00 AM Beijing time, Junjie and I may need to get up in the middle of the night to attend this meeting. It would be better if we could adjust the time, but if everyone has recognized this point in time, we ca

Re: Compaction Sync - Monday

2021-04-18 Thread OpenInx
o the original event! > Russ > > On Apr 18, 2021, at 9:00 PM, OpenInx wrote: > > Thanks for pinging me. > > I'd like to attend that meeting, but the time is April 20 at 00:00 AM > Beijing time, Junjie and I may need to get up in the middle of the night > to attend

Re: Stableness of V2 Spec/API

2021-05-17 Thread OpenInx
Hi Huadong From the perspective of Iceberg developers, we don't expose format v2 to end users because we think there is still other work that needs to be done. As you can see from your link, there are still some unfinished issues. As for whether v2 will cause data loss, from my perspective as

Re: Stableness of V2 Spec/API

2021-05-17 Thread OpenInx
uadong Liu wrote: > Thanks. Compaction is https://github.com/apache/iceberg/pull/2303 and it > is currently blocked by https://github.com/apache/iceberg/issues/2308? > > On Mon, May 17, 2021 at 6:17 PM OpenInx wrote: > >> Hi Huadong >> >> From the perspective of iceber

Re: Welcoming OpenInx as a new PMC member!

2021-06-29 Thread OpenInx
PM wgcn.bj wrote: > >> Congrats! >> >> Original message >> *From:* Dongjoon Hyun >> *To:* dev >> *Sent:* Wednesday, June 30, 2021, 10:05 >> *Subject:* Re: Welcoming OpenInx as a new PMC member! >> >> Congratulations! >> >> Dongjoon. >> >

Re: Welcoming Jack Ye as a new committer!

2021-07-06 Thread OpenInx
Congrats, Jack! On Wed, Jul 7, 2021 at 7:40 AM Miao Wang wrote: > Congratulations! > > Miao > > Sent from my iPhone > > On Jul 5, 2021, at 4:14 PM, Daniel Weeks wrote: > > > Great work Jack, Congratulations! > > On Mon, Jul 5, 2021 at 1:21 PM karuppayya > wrote: > >> Congratulations Jack! >

Re: [VOTE] Adopt the v2 spec changes

2021-07-27 Thread OpenInx
> adopt the pending v2 spec changes as the supported v2 spec I assume this vote is meant to reach consensus among the community members that we won't introduce any breaking changes in the v2 spec, not to discuss exposing v2 to SQL tables like the following, right? CREATE TABLE prod.db.sample (

Re: Iceberg community sync notes for 1 September 2021

2021-09-08 Thread OpenInx
Thanks for the summary, Ryan ! I would like to add the following thing into the roadmap for 0.13.0: *Flink Integration* 1. Upgrade the flink version from 1.12.1 to 1.13.2 ( https://github.com/apache/iceberg/pull/2629). Because there is a bug in flink 1.12.1 when reading nested data types (Map

Re: Iceberg community sync notes for 1 September 2021

2021-09-08 Thread OpenInx
, 2021 at 9:36 AM OpenInx wrote: > Thanks for the summary, Ryan ! > > I would like to add the following thing into the roadmap for 0.13.0: > > *Flink Integration* > > 1. Upgrade the flink version from 1.12.1 to 1.13.2 ( > https://github.com/apache/iceberg/pull/2629). >

Re: [DISCUSS] Spark version support strategy

2021-09-15 Thread OpenInx
Thanks for bringing this up, Anton. Everyone has good pros and cons to support their preferences. Before giving my preference, let me raise one question: what is the top priority for the Apache Iceberg project at this point in time? This question will help us to answer the following question:

Re: [DISCUSS] Iceberg roadmap

2021-09-18 Thread OpenInx
Thanks Steven & Kyle. Yes, the flip-27 source and flink 1.13.2 are orthogonal because the flink's flip-27 API was successfully introduced in flink 1.12 release ( https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface). The WIP flip-27 iceberg source proposed from

Re: can not use iceberg as a sql source in flink sql according to iceberg 0.12.0

2021-09-22 Thread OpenInx
Hi Joshua Can you check what's the parquet version you are using ? Looks like the line 112 in HadoopReadOptions is not the first line accessing the variables in ParquetInputFormat. [image: image.png] On Wed, Sep 22, 2021 at 11:07 PM Joshua Fan wrote: > Hi > I am glad to use iceberg as table

Re: [DISCUSS] Spark version support strategy

2021-09-28 Thread OpenInx
lots of time to upgrade >>>>>>>> Spark and >>>>>>>> picking up new Iceberg features. >>>>>>>> >>>>>>>> Another way of thinking about this is that if we went with option >>>>>

Re: [DISCUSS] Spark version support strategy

2021-10-07 Thread OpenInx
the desire to do it >> please reach out and coordinate with us! >> >> Ryan >> >> On Wed, Sep 29, 2021 at 9:12 PM Steven Wu wrote: >> >>> Wing, sorry, my earlier message probably misled you. I was speaking my >>> personal opinion on Flink versio

Re: Iceberg sync times

2021-10-09 Thread OpenInx
Thanks Ryan for bringing this up ! I attended several Iceberg syncs at 5 PM pacific time (9AM CST) and attended only one Iceberg sync at 9AM pacific time (1 AM CST), and have the following feelings: 1. We usually arrive at the office around 9:30AM to 10AM CST ( 5:30 PM ~ 6:00 PM pacific time).

Re: Snapshot tagging, branching and retention

2021-10-13 Thread OpenInx
Is it possible to maintain a meeting note for this and publish it to the mail list because I don't think everybody could attend this meeting ? Thanks. On Thu, Oct 14, 2021 at 2:00 AM Jack Ye wrote: > Hi everyone, > > Based on some offline discussions with different people around > availability,

Re: Meeting Minutes from 10/20 Iceberg Sync

2021-10-21 Thread OpenInx
Thanks for the detailed report! One more thing: we have now made a lot of progress in integrating Alibaba Cloud (https://www.aliyun.com/). Please see https://github.com/apache/iceberg/projects/21 (Thanks @xingbowu - https://github.com/xingbowu). On Thu, Oct 21, 2021 at 11:30 PM Sam Redai wrote

Re: Iceberg 0.12.1 Patch Release - Call for Bug Fixes and Patches

2021-10-27 Thread OpenInx
I think we will need to fix this critical iceberg bug before we release the 0.12.1: https://github.com/apache/iceberg/issues/3393 . Let's mark it as a blocker for the 0.12.1. On Fri, Oct 22, 2021 at 3:22 AM Kyle Bendickson wrote: > Thank you everybody for the additional PRs brought up so far. >

Re: Iceberg 0.12.1 Patch Release - Call for Bug Fixes and Patches

2021-10-27 Thread OpenInx
t to > get it into 0.12.1. > > What does everyone else think? Should we wait for this Hive fix? > > On Wed, Oct 27, 2021 at 3:17 AM OpenInx wrote: > >> I think we will need to fix this critical iceberg bug before we release >> the 0.12.1: https://github.com/apache/iceb

Re: [DISCUSS] Iceberg roadmap

2021-10-31 Thread OpenInx
Update: I think the project [Flink: Upgrade to 1.13.2][1] in RoadMap can be closed now, because all of the issues have been addressed. [1]. https://github.com/apache/iceberg/projects/12 On Tue, Sep 21, 2021 at 6:17 PM Eduard Tudenhoefner wrote: > I created a Roadmap section in https://github.

Re: [DISCUSS] Iceberg roadmap

2021-11-04 Thread OpenInx
ade project and marked the FLIP-27 project priority 1. > Thanks for all the work to get this done! > > On Sun, Oct 31, 2021 at 8:10 PM OpenInx wrote: > >> Update: >> >> I think the project [Flink: Upgrade to 1.13.2][1] in RoadMap can be >> closed now, because all of t

Re: [VOTE] Release Apache Iceberg 0.12.1 RC0

2021-11-05 Thread OpenInx
validate license headers: dev/check-license: OK 7. Build and test the project: ./gradlew build (use Java 8) : OK 8. Check the flink works fine by the following command line: ./bin/sql-client.sh embedded -j /Users/openinx/Downloads/apache-iceberg-0.12.1/flink-runtime/build/libs/iceberg-flink

Re: [DISCUSS] Iceberg roadmap

2021-11-07 Thread OpenInx
Any thoughts on adding StarRocks integration to the roadmap? I think the people from the StarRocks community can provide more background and input. On Thu, Nov 4, 2021 at 5:59 PM OpenInx wrote: > Update: > > StarRocks[1] is a next-gen sub-second MPP database for full analysis >

Re: Upcoming Iceberg Community Sync (11/17 9:00am PT)

2021-11-16 Thread OpenInx
Let me give more input from my perspective. 1. Fixed a few critical Flink v2 reader bugs: a. The Flink Avro reader bug: https://github.com/apache/iceberg/pull/3540 b. v2's extra meta columns messed up Flink's RowData pos: * https://github.com/apache/iceberg/pull/35

Re: Upcoming Iceberg Community Sync (11/17 9:00am PT)

2021-11-16 Thread OpenInx
* Another related PR to enhance the unit tests is: https://github.com/apache/iceberg/pull/3477 (Need someone to review & merge this). On Wed, Nov 17, 2021 at 10:03 AM OpenInx wrote: > Let me give more inputs from my perspective. > > 1. Fixed few critical flink v2 reader bugs

Re: Welcome new PMC members!

2021-11-17 Thread OpenInx
Congrats, Jack and Russell ! Well deserved ! On Thu, Nov 18, 2021 at 9:08 AM karuppayya wrote: > Congratulations Russell and Jack!! > > - Karuppayya > > On Wed, Nov 17, 2021 at 5:02 PM Yufei Gu wrote: > >> Congratulations, Jack and Russell! >> >> Best, >> >> Yufei >> >> `This is not a contrib

Re: Vendor integration strategy

2021-12-09 Thread OpenInx
directly in the classpath to avoid such need in the very near >> future, and EMR will maintain their AWS SDK version upgrade independently. >> >> But the approach proposed by Aliyun seems to fit the use case of Aliyun >> users better. For more context, please read >> ht

Re: Vendor integration strategy

2021-12-12 Thread OpenInx
https://github.com/apache/iceberg/pull/3725 The usage example is here: https://github.com/apache/iceberg/pull/3725#issue-800973927 We can vote for option#1 or option#2. Any feedback is welcome, thanks in advance. On Thu, Dec 9, 2021 at 8:29 PM OpenInx wrote: > Thanks Jack for bringing this up, an

Re: Vendor integration strategy

2021-12-13 Thread OpenInx
accessing Aliyun oss services. Thanks. On Tue, Dec 14, 2021 at 9:13 AM Jack Ye wrote: > Thank you Openinx for preparing all these PRs and the vote options! > > In the community sync, we also talked about not including any new vendor > integration modules in engine runtimes. In this appr

Re: New Versioned Iceberg Documentation Site

2022-02-06 Thread OpenInx
The new site looks great to me, thanks all for the work ! One unrelated thing: I remember we had a discussion to bring a new page in the doc site to collect all the design docs (such as google doc, github issues etc), is there any progress for this thing ? Someone who connected to me has raise
