Dear Apache Incubator Community, 
We propose to contribute Chunjun as an Apache Incubator project.
We are still looking for a Champion and Mentors. Please let us know if you 
would like to volunteer. Thanks a lot.
Best Regards, 
Real-time computing engine team of DTStack.

#Chunjun Proposal

##Abstract
Chunjun is a distributed ETL and data integration tool. It is currently built 
on Apache Flink. It was initially known as FlinkX and was renamed Chunjun on 
February 22, 2022.
- Chunjun codebase: https://github.com/DTStack/chunjun

##Proposal
We propose to contribute the Chunjun codebase to the Apache Software Foundation 
with the intent of forming a productive, meritocratic and open community around 
Chunjun's continued development, according to the 'Apache Way'. Chunjun's 
source code is already licensed under the Apache License, Version 2.0.

##Background
We developed Chunjun at DTStack in 2017, when we needed a low-code, 
high-performance data integration tool. It has been an open-source project on 
GitHub since April 2018. Chunjun has been running in DTStack's production 
environment ever since. It is also widely used by companies in China, 
including DTStack (https://www.dtstack.com/), Qihu360 (https://www.360.cn/), 
Iflytek (https://www.iflytek.com/), XPeng Motors (https://en.xiaopeng.com/), 
WeBank (https://www.webank.com/), Asiainfo (https://asiainfo.com/), Guazi 
(https://www.guazi.com/), and Hello Inc (https://www.hello-inc.com/). Today, 
Chunjun has a strong community in China.

##Rationale
Chunjun's high performance comes from Apache Flink, and Chunjun can integrate 
data from different data sources. Users only need to write a JSON configuration 
file to define data reading, transformation, and writing. Users can implement 
new reader/writer plugins to meet their requirements. Chunjun has implemented 
plugins that capture data changes from MySQL and restore them to Apache Doris.
Chunjun has the following features: 
- Real-time and offline data integration across different data sources.
- Change data capture (CDC) for merging and restoring data.
- Resuming an interrupted job from its breakpoint.
- Capturing and collecting dirty data.
- Limiting the data transfer rate.
- Throughput metrics.
- Capturing and restoring schema evolution. (TODO)
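
The JSON-driven read, transform, and write flow described above, together with 
dirty-data collection, can be sketched roughly as follows. This is only an 
illustration: the JSON keys (`reader`, `transformer`, `writer`, `rows`, 
`field`) and the `run_job` function are hypothetical and do not reflect 
Chunjun's actual configuration schema or its Flink-based implementation.

```python
import json

# Hypothetical job description. The keys below are illustrative only and are
# NOT Chunjun's real configuration schema.
JOB_JSON = """
{
  "reader": {"name": "list_reader",
             "rows": [{"id": "1"}, {"id": "x"}, {"id": "3"}]},
  "transformer": {"name": "cast_int", "field": "id"},
  "writer": {"name": "memory_writer"}
}
"""


def run_job(job_json):
    """Run one read -> transform -> write pass.

    Returns a pair (written, dirty): rows that were transformed and written,
    and rows that failed the transformation and were set aside as dirty data.
    """
    job = json.loads(job_json)
    field = job["transformer"]["field"]
    written, dirty = [], []
    for row in job["reader"]["rows"]:
        try:
            # Transformation step: cast the configured field to an integer.
            written.append({**row, field: int(row[field])})
        except (KeyError, TypeError, ValueError):
            # Collect the bad record instead of failing the whole job.
            dirty.append(row)
    return written, dirty


written, dirty = run_job(JOB_JSON)
```

In this sketch a record that cannot be transformed does not stop the job. It is 
set aside for later inspection, which is the general idea behind collecting 
dirty data.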

##Current Status
###Meritocracy
Since Chunjun was open-sourced, many enterprises have adopted it to build 
their data integration systems. In return, we have received many issue reports 
and enhancements from them. The codebase is now mainly managed by the 
development team inside DTStack, which is also responsible for building 
DTStack's internal data integration system.
###Community
Chunjun has been building a community of contributors and users for the last 
five years. We organized one meetup in 2020. Currently, we communicate through 
GitHub issues and a Chinese-language DingTalk group with about 3000 members. 
We believe that we can also get a lot of help from the Apache Flink community. 
We plan to organize another meetup in 2022.
###Core Developers
(In alphabetical order) 
Chao Xu (https://github.com/zoudaokoulife)
Gongjiang Tang (https://github.com/kyo-tom)
Huai Yang (https://github.com/yanghuaiGit)
Jiangbo Li (https://github.com/lijiangbo)
Luning Wong (https://github.com/deadwind4)
Luo Li (https://github.com/kanata163)
Sishu Yang (https://github.com/yangsishu)
Tianzhu Wen (https://github.com/WTZ468071157)
Weiliang Hao (https://github.com/xiuzhu9527)
Wenqiang Liu (https://github.com/meng1222)
Xing Liu (https://github.com/simenliuxing)
Yang Lan (https://github.com/HiLany)
Yanquan Lv (https://github.com/lvyanquan)
Yifan Hu (https://github.com/demotto)
Zaiyue Yu (https://github.com/tonybobam)
Zhangwan Zhao (https://github.com/jiemotongxue)
Zhiqiang Li (https://github.com/ChestnutQiang)
Almost all of them work in the real-time computing engine team of DTStack; 
only Yifan Hu works for CaoCao Tech. Most of them are Apache Flink contributors.

##Known Risks
###Project Name
The name of the project is Chunjun. It comes from the Mandarin Chinese Pinyin 
"Chun Jun", the name of one of the ten most famous swords in China.
###Orphaned products
More than 20 contributors and thousands of forks and stars show that Chunjun 
is actively supported, and we seek to grow the community further with the aid 
of Apache. As a consequence, Chunjun is unlikely to become an orphaned project.
###Inexperience with Open Source
Many of the Chunjun committers have experience working on open source projects. 
They are also active contributors to other Apache projects.
###Homogenous Developers
Most of the core developers are from DTStack, but Chunjun has also received 
bug fixes and enhancements from developers outside DTStack.
###Reliance on Salaried Developers
Currently, most of the core developers are paid by DTStack to work on the 
Chunjun project. We look forward to attracting more people outside DTStack to 
join this project.
###Relationships with Other Apache Products
Chunjun integrates with Apache Flink, Apache Hadoop, Apache Commons, Apache 
HttpComponents, Apache Log4j, and Apache Maven.
Chunjun plugins also use the following Apache projects:
- Apache Hive
- Apache Solr
- Apache Doris
- Apache HBase
- Apache Kudu
- Apache Kafka
- Apache Pulsar (TODO)
###An Excessive Fascination with the Apache Brand
We acknowledge the value and reputation that the Apache brand would bring to 
Chunjun. However, our primary interest is in the excellent community provided 
by the Apache Software Foundation, in which all projects can gain stability 
for long-term development.

##Documentation
A complete set of documents is provided on GitHub, including English and 
Simplified Chinese versions.
English: https://github.com/DTStack/chunjun/blob/master/README.md
Chinese: https://github.com/DTStack/chunjun/blob/master/README_CH.md

##Initial Code
https://github.com/DTStack/chunjun

##Initial Source and Intellectual Property Submission Plan
The codebase is already licensed under the Apache License 2.0 and the copyright 
is assigned to DTStack. If the project enters the Incubator, DTStack will 
transfer the source code and trademark ownership to the ASF via a Software 
Grant Agreement. Our initial committers will submit iCLA(s), SGA, and CCLA(s).

##External Dependencies
Apache-2.0 licenses
- Apache Avro
- Apache Commons
- Apache Curator
- Apache Flink
- Apache Hadoop
- Apache HttpComponents
- Apache Log4j
- Gson
- Guava
- Jackson
- Powermock
- Prometheus

Eclipse Distribution License
- JUnit

EPL licenses
- Logback

MIT licenses
- Mockito
- SLF4J

##Required Resources
###Git Repositories
https://github.com/apache/incubator-chunjun
###Issue Tracking
The community would like to continue using GitHub Issues.
###Mailing List
priv...@chunjun.incubator.apache.org
d...@chunjun.incubator.apache.org
comm...@chunjun.incubator.apache.org
###Continuous Integration tool
GitHub Actions

##Initial Committers 
(In alphabetical order) 
Chao Xu (https://github.com/zoudaokoulife, xuchao at dtstack dot com)
Luning Wong (https://github.com/deadwind4, gfeng48 at gmail dot com)
Sishu Yang (https://github.com/yangsishu, sishu at dtstack dot com)
Yang Huai (https://github.com/yanghuaiGit, dujie at dtstack dot com)
Zhiqiang Li (https://github.com/ChestnutQiang, wujuan at dtstack dot com)


##Affiliations 
The initial committers are employees of DTStack. The nominated mentors and 
champion are employees of TODO.

##Sponsors 

###Champion
TODO
###Nominated Mentors
TODO
