Hi, Community I want to start a preliminary discussion about increment the parallel of flink connectors with optional transaction supporting. looking forward to your feedback and advices.
*Blew are the three scenarios I summarized we should support:* 1. *unordered at least once*: this is for the store engine which mainly support append data, such as clickhouse. 2. *ordered for partition with optional flush when checkpoint*: this is for the store engine which support upsert but don't support transaction across rows. such hbase, elasticsearch. 3. *global ordered with optional transaction*: this is for the store engine which support transaction across rows. *here is the basic data write flow:* *1.base data sink flow:* [sink data] -------- | | ThreadSafe [timeout event] -------- | --------> [Ring Buffer] -------> [Sink Buffer Manager] -------> [trigger flush the data of sink buffer ] | [checkpoint event] -------- | *2.when to flush the data* a. if enable flush the data when checkpoint or transaction. just flush data when received the checkpoint event. b. exceed the max batch size; c. receive a timeout event; d. receive a checkpoint event; *3.how the sink buffer should be organized.* a. unordered at least once the sink buffer should be organized as a linkedList with size limit and flush the data concurrently. b. *ordered for partition with optional flush when checkpoint:* the sink buffer should be organized as a map, the key should be determinzed by parition key or primary key(primary key % sink parallel), the value should be a sink buffer list with size limit. we try move the data to sink buffer by parititioned key or primary key, and flush the data of different sink buffer list concurrently. c. *global ordered with optional transaction*: the sink buffer should be organized as a linkedList with size limit and flush the data serially *how the logical should be implement.* I think we should use the latest flink-143: Unified Sink API [1], and support split table. [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API -- *Best Regards* *Jeremy Mei*