I've used Redis for a long time, and Pegasus looks very interesting.

I'd like to see a champion and some mentors but otherwise I really like
what I see here.

Regards,
KAM
--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171


On Tue, Jun 2, 2020 at 3:49 AM 吴涛 <wutao.as.nevercha...@gmail.com> wrote:

> Dear Apache Incubator Community,
>
> I'd like to open up a discussion about incubating Pegasus at Apache. Our
> proposal can be found at https://pegasus-kv.github.io/community/proposal
> and is also included below.
>
> We are looking for a Champion, if anyone would like to volunteer.
> Thanks a lot!
>
> Best regards
>   Tao Wu
>
> Pegasus Proposal
>
> == Abstract ==
>
> Pegasus is a distributed key-value storage system that is designed to be
> horizontally scalable, strongly consistent and high-performance.
>
> - Pegasus codebase: https://github.com/XiaoMi/pegasus
> - Website: https://pegasus-kv.github.io
>
> == Proposal ==
>
> Pegasus is a key-value database that delivers low-latency data access
> together with horizontal scalability, using hash-based partitioning.
> Pegasus uses the PacificA protocol for strong consistency and RocksDB as
> the underlying storage engine.
>
> We propose to contribute the Pegasus codebase and associated artifacts
> (e.g., documentation, website content, etc.) to the Apache Software
> Foundation, and aim to build an open community around Pegasus’s continued
> development in the ‘Apache Way’.
>
> == Background ==
>
> Until Pegasus came out in 2015, Apache HBase was essentially the only
> large-scale KV store in use at XiaoMi. Pegasus was originally built to
> solve the problems caused by HBase’s two-level architecture and
> implementation: high latency caused by Java GC and the RPC overhead of the
> underlying distributed filesystem, and long failover time caused by each
> RegionServer being a single point of failure and by the recovery overhead
> of splitting and replaying the HLog files.
>
> Pegasus aims to fill the gap between Redis and HBase. The former is
> in-memory and low-latency but does not provide a strong-consistency
> guarantee. Unlike the latter, the Pegasus server is written entirely in
> C++, and its read-write path relies only on the local filesystem.
>
> Apart from performance requirements, we also need a storage system to
> ensure multiple-level data safety and support fast data migration among
> data centers, automatic load balancing, and online partition splitting.
>
> After investigating many existing storage systems in the open source
> world, we could not find one that satisfied all of these requirements, so
> the journey of Pegasus began.
>
> === Rationale ===
>
> Pegasus is a mature and active project which has been widely adopted in
> XiaoMi. Since its initial open-source release in 2017, we have seen a
> great amount of interest across a diverse set of users and companies.
>
> Our experience as committers and PMC members on other Apache projects
> has convinced us that a long-term home at the Apache Software Foundation
> would be a great fit for the project, ensuring that processes and
> procedures are in place to keep the project and community ‘healthy’ and
> free of any commercial, political or legal faults.
>
> === Initial Goal ===
>
> - Move the existing codebase, website, documentation, and mailing lists to
>   Apache-hosted infrastructure.
> - Work with the infrastructure team to implement and approve our code
>   review, build, and testing workflows in the context of the ASF.
> - Continue incremental development and releases in line with Apache
>   guidelines.
>
> == Current Status ==
>
> Pegasus has been an open-source project on GitHub
> https://github.com/XiaoMi/pegasus since October 2017.
>
> === Meritocracy ===
>
> The intent of this proposal is to start building a diverse developer and
> user community around Pegasus following the ASF meritocracy model. We plan
> to invite more people as committers if they contribute to this project.
>
> === Releases ===
>
> Pegasus has undergone multiple public releases, listed here:
> https://github.com/XiaoMi/pegasus/releases.
>
> These old releases were not performed in the typical ASF fashion. We will
> adopt the ASF source release process upon joining the incubator.
>
> === Code Reviews ===
>
> Pegasus’s code reviews are currently public on Github
> https://github.com/XiaoMi/pegasus/pulls.
>
> === Community ===
>
> Pegasus seeks to grow its developer and user communities during incubation.
>
> === Core Developers ===
>
> Currently most of the core developers of Pegasus work in the KV-Storage
> Team of Xiaomi. Yingchun Lai is an Apache Kudu PMC member. Zuoyan Qin is
> an experienced open-source developer who created sofa-pbrpc at his
> previous job at Baidu. Wei Huang is also an active contributor to
> Apache Doris (Incubating).
>
> - Zuoyan Qin (https://github.com/qinzuoyan)
> - Yuchen He (https://github.com/hycdong)
> - Tao Wu (https://github.com/neverchanje)
> - Yingchun Lai (https://github.com/acelyc111)
> - Wei Huang (https://github.com/vagetablechicken)
> - Shuo Jia (https://github.com/Shuo-Jia)
> - Liwei Zhao (https://github.com/levy5307)
>
> === Alignment ===
>
> Pegasus is aligned with several other ASF projects.
>
> We are working on a new feature to load data from the HDFS filesystem.
> Pegasus can also generate and store checkpoints to HDFS, for both backup
> and analysis purposes. We currently support offline analysis on
> checkpoints powered by Apache Spark.
>
> == Known Risks ==
>
> === Orphaned Products ===
>
> The core developers of XiaoMi’s Pegasus team work full time on this
> project. There is very little risk of Pegasus getting orphaned since at
> least one large company (XiaoMi) is extensively using it in production,
> currently at a scale of 70+ clusters, 800+ tables, and more than 70 TB of
> data. Furthermore, since Pegasus was open sourced at the beginning of
> October 2017, it has received more than 1200 stars and been forked more
> than 200 times, and also received some issues and pull requests from
> developers and users outside XiaoMi. We plan to extend and diversify this
> community further through Apache.
>
> === Inexperience with Open Source ===
>
> The core developers are all active users and followers of open source.
> They are already committers and contributors to the Pegasus GitHub project.
> All have worked on source code released under an open source license, and
> several of them also have experience developing code in an open source
> environment.
>
> Several of the developers in XiaoMi’s storage team are committers and/or
> PMC members on other ASF projects (Kudu, HBase, Doris, etc.). Together
> with the incubator mentors, they will guide others in practicing the
> Apache Way.
>
> === Homogenous Developers ===
>
> The project has received some contributions from developers outside of
> XiaoMi, and is starting to attract a user community as well. We hope to
> continue to encourage contributions from these developers and community
> members, and grow them into committers as they have time to continue their
> contributions.
>
> === Reliance on Salaried Developers ===
>
> XiaoMi has invested in Pegasus as a general-purpose key-value store that
> is widely used across the company. The core developers have been dedicated
> to this project for nearly five years.
>
> In addition, we look forward to attracting more people outside XiaoMi to
> contribute to this project, whether paid engineers working in the storage
> area or individual volunteers, as long as they are enthusiastic about the
> Pegasus project.
>
> === An Excessive Fascination with the Apache Brand ===
>
> Pegasus is proposing to enter incubation at Apache in order to help
> efforts to diversify the committer-base, not so much to capitalize on the
> Apache brand. The Pegasus project is in production use already inside
> XiaoMi, but is not expected to be a XiaoMi product for external customers.
> As such, the Pegasus project is not seeking to use the Apache brand as a
> marketing tool.
>
> == Documentation ==
>
> Information about Pegasus can be found at
> https://github.com/XiaoMi/pegasus. The following links provide more
> information about Pegasus in open source:
>
> - Pegasus Website: https://pegasus-kv.github.io
> - Codebase at Github: https://github.com/XiaoMi/pegasus
> - Issue Tracking: https://github.com/XiaoMi/pegasus/issues
> - Releases: https://pegasus-kv.github.io/releases
> - Community Guide: https://pegasus-kv.github.io/community
>
> == Initial Source ==
>
> Besides the core codebase, Pegasus also hosts its side projects on GitHub
> under the XiaoMi organization. Specifically, the initial source includes:
>
> Client libraries in different languages:
>
> - Java-Client: https://github.com/XiaoMi/pegasus-java-client
> - Scala-Client: https://github.com/XiaoMi/pegasus-scala-client
> - NodeJS-Client: https://github.com/XiaoMi/pegasus-nodejs-client
> - Go-Client: https://github.com/XiaoMi/pegasus-go-client
> - Python-Client: https://github.com/XiaoMi/pegasus-python-client
>
> Components of Pegasus:
>
> - rDSN: https://github.com/XiaoMi/rdsn
> - RocksDB: https://github.com/XiaoMi/pegasus-rocksdb
>
> rDSN was initially a distributed framework developed by Zhenyu Guo from
> Microsoft, and we have heavily refactored and improved it to make it a
> better fit for Pegasus. rDSN is MIT & Apache-2.0 dual-licensed. The
> Apache-2.0-licensed code belongs to XiaoMi, and the copyright of the
> MIT-licensed code is assigned to Microsoft. We plan to merge Pegasus and
> rDSN into one project.
>
> RocksDB is a Facebook-developed storage engine. Pegasus added some
> enhancements and modifications that may be incompatible with the original
> implementation. RocksDB is licensed under Apache 2.0 License.
>
> == External Dependencies ==
>
> Pegasus has the following external dependencies.
>
> - RocksDB (Apache)
> - Apache Thrift (Apache Software License v2.0)
> - Boost (Boost Software License)
> - Apache Zookeeper (Apache)
> - Google s2geometry (BSD)
> - Google gflags (BSD)
> - fmtlib (BSD)
> - POCO (Boost Software License)
> - rapidjson (Tencent)
> - libevent (BSD)
> - Google gperftools (BSD)
> - cameron314/concurrentqueue (BSD)
> - cameron314/readerwriterqueue (BSD)
> - XiaoMi/galaxy-fds-sdk-cpp (No License)
> - jupp0r/prometheus-cpp (MIT)
> - curl (The curl license)
> - nlohmann/json (MIT)
> - abseil-cpp (Apache 2.0)
> - antirez/linenoise (BSD-2)
> - antirez/sds (BSD-2)
>
> Build and test dependencies:
>
> - Apache Maven (Apache Software License v2.0)
> - cmake (BSD)
> - Google gtest (Apache Software License v2.0)
>
> == Required Resources ==
>
> === Mailing List ===
>
> There are currently no mailing lists. The usual mailing lists are expected
> to be set up when entering incubation:
>
> - priv...@pegasus.incubator.apache.org
> - d...@pegasus.incubator.apache.org
> - comm...@pegasus.incubator.apache.org
>
> === Git Repositories ===
>
> Upon entering incubation, we want to move the existing repository from
> https://github.com/XiaoMi/pegasus to Apache infrastructure like
> https://github.com/apache/incubator-pegasus.
>
> === Issue Tracking ===
>
> Pegasus currently uses GitHub to track issues. We would like to continue
> to do so while we discuss migration possibilities with the ASF Infra
> committee.
>
> === Other Resources ===
>
> The existing code already has unit tests so we will make use of existing
> Apache continuous testing infrastructure. The resulting load should not be
> very large.
>
> == Source and Intellectual Property Submission Plan ==
>
> Most of the current code is Apache 2.0 licensed and the copyright is
> assigned to XiaoMi. If the project enters the incubator, XiaoMi will
> transfer the source code & trademark ownership to the ASF via a Software
> Grant Agreement.
>
> However, for historical reasons, Pegasus is based on MIT-licensed code
> originally written for microsoft/rDSN. Because the original project is
> unmaintained, the Pegasus team has long developed that code actively, and
> the modified code is licensed under the Apache License 2.0. We aren’t sure
> whether we should request a Contributor License Agreement (CLA) from
> Microsoft during the IP clearance process.
>
> == Initial Committers ==
>
> - Zuoyan Qin (https://github.com/qinzuoyan, qinzuoyan@xiaomi dot com)
> - Weijie Sun (https://github.com/shengofsun, luckyweijie@gmail dot com)
> - Yuchen He (https://github.com/hycdong, heyuchen@xiaomi dot com)
> - Tao Wu (https://github.com/neverchanje, wutao1@xiaomi dot com)
> - Yingchun Lai (https://github.com/acelyc111, laiyingchun@xiaomi dot com)
> - Wei Huang (https://github.com/vagetablechicken, huangwei5@xiaomi dot com)
> - Shuo Jia (https://github.com/Shuo-Jia, jiashuo1@xiaomi dot com)
> - Liwei Zhao (https://github.com/levy5307, zhaoliwei@xiaomi dot com)
> - Liuyang Cai (https://github.com/LoveHeat)
>
> == Affiliations ==
>
> Seven of the initial committers are employees of Xiaomi.
>
> == Sponsors ==
>
> === Champion ===
>
> TODO
>
> === Nominated Mentors ===
>
> TODO
>
> === Sponsoring Entity ===
>
> We are requesting the Incubator to sponsor this project.
>
>
>
>
