Hi Etienne,

Thanks for sharing this article. I really like it and learned a lot from it.
I'd like to raise some questions about implementing a batch source. Other devs are welcome to share their insights.

The first question is how to generate splits. As the article mentions: "Whenever possible, it is preferable to generate the splits lazily, meaning that each time a reader asks the enumerator for a split, the enumerator generates one on demand and assigns it to the reader." I think this may not hold for all cases. In some cases, generating a split may be time-consuming, so it may be better to generate a batch of splits on demand to amortize the cost. But that raises another question: how many splits should be generated in a batch? Too many may well cause an OOM; too few may not make good use of batch generation. To solve this, I think we could provide a configuration option that lets users set how many splits are generated per batch. What's your opinion on this? Have you ever encountered this problem in your implementation?

The second question is how to assign splits. What's your split assignment strategy? In Flink, we provide `LocalityAwareSplitAssigner` to take advantage of locality when assigning splits to readers. But it may not be perfect for the failover case, for which we intend to introduce another split assignment strategy [1]. I do think it should be configurable so that advanced users can decide which assignment strategy to use. Other devs are welcome to share their opinions.

[1]: https://issues.apache.org/jira/browse/FLINK-31065

Also as for split assigner .

Best regards,
Yuxia

----- Original Message -----
From: "Etienne Chauchot" <echauc...@apache.org>
To: "dev" <dev@flink.apache.org>
Cc: "Chesnay Schepler" <ches...@apache.org>
Sent: Thursday, March 30, 2023 10:36:39 PM
Subject: [blog article] Howto create a batch source with the new Source framework

Hi all,

After creating the Cassandra source connector (thanks Chesnay for the review!), I wrote a blog article about how to create a batch source with the new Source framework [1].
It gives field feedback on how to implement the different components. I felt it could be useful to people interested in contributing or migrating connectors.

=> Can you give me your opinion?

=> I think it could also be useful to post the article to the official Flink blog, if you agree.

=> Same remark on my previous article [2]: what about publishing it to the official Flink blog?

[1] https://echauchot.blogspot.com/2023/03/flink-howto-create-batch-source-with.html
[2] https://echauchot.blogspot.com/2022/11/flink-howto-migrate-real-life-batch.html

Best
Etienne
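
To make the batched split generation idea from the first question in this thread more concrete, here is a minimal sketch in plain Java. It is only an illustration under stated assumptions: `BatchedSplitProvider`, its `batchSize` parameter and `getRefillCount()` are hypothetical names, not Flink classes or options, and the expensive split generation is modeled as a plain `Iterator`.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.Optional;
import java.util.Queue;

/**
 * Hypothetical helper (not part of Flink): buffers splits coming from an
 * expensive generator so that the generation cost is paid once per batch
 * instead of once per split request.
 */
final class BatchedSplitProvider<SplitT> {
    private final Iterator<SplitT> generator;   // stands in for the costly split generation
    private final int batchSize;                // hypothetical user-configurable option
    private final Queue<SplitT> buffer = new ArrayDeque<>();
    private int refillCount = 0;                // number of batches generated so far

    BatchedSplitProvider(Iterator<SplitT> generator, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.generator = generator;
        this.batchSize = batchSize;
    }

    /** Returns the next split, or empty when the generator is exhausted. */
    Optional<SplitT> nextSplit() {
        if (buffer.isEmpty()) {
            refill();
        }
        return Optional.ofNullable(buffer.poll());
    }

    /** Generates at most batchSize splits, which bounds the memory held in the buffer. */
    private void refill() {
        int generated = 0;
        while (generated < batchSize && generator.hasNext()) {
            buffer.add(generator.next());
            generated++;
        }
        if (generated > 0) {
            refillCount++;
        }
    }

    int getRefillCount() {
        return refillCount;
    }
}
```

In an enumerator, `handleSplitRequest` could call `nextSplit()` and then either assign the returned split to the requesting reader or signal that no more splits are available. A `batchSize` of 1 degenerates to the fully lazy approach described in the article, so the two approaches could share one code path.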