One more thing to mention here: the Pulsar Docker image currently bundles the C++ client and the Python client. From my perspective, the image is mainly used as a server, so perhaps we can stop bundling these clients.
Best,
tison.


Yunze Xu <y...@streamnative.io.invalid> wrote on Tue, Sep 20, 2022 at 16:50:

> Hi Tison,
>
> Sorry, I just missed that. Thanks for your reminder.
>
> Thanks,
> Yunze
>
>
> > On Sep 20, 2022, at 16:35, tison <wander4...@gmail.com> wrote:
> >
> > Hi Yunze,
> >
> >> Just wondering if there is a way to retain the git history in the
> >> pulsar-client-cpp directory?
> >
> > Matteo's proposal already says:
> >
> >> git filter-repo --subdirectory-filter pulsar-client-cpp
> >
> > So you will retain the git history.
> >
> > Best,
> > tison.
> >
> >
> > Yunze Xu <y...@streamnative.io.invalid> wrote on Tue, Sep 20, 2022 at 16:27:
> >
> >> LGTM. I also recently listed the related files outside the
> >> pulsar-client-cpp directory:
> >>
> >> - pulsar-common/src/main/proto/PulsarApi.proto: the Pulsar binary
> >>   proto file
> >> - src/gen-pulsar-version-macro.py: generate the internal version info
> >> - pulsar-client/src/test/proto/*.proto: test the protobuf native
> >>   schema feature
> >>
> >> Handling them should not be complicated. Just wondering if there is
> >> a way to retain the git history in the pulsar-client-cpp directory?
> >>
> >> Thanks,
> >> Yunze
> >>
> >>
> >>> On Sep 20, 2022, at 07:25, Matteo Merli <mme...@apache.org> wrote:
> >>>
> >>> https://github.com/apache/pulsar/issues/17724
> >>>
> >>> ## Motivation
> >>>
> >>> The Pulsar C++ code base lives in the main repository of the Pulsar
> >>> project.
> >>>
> >>> While the decision was the right one at the time, there is a
> >>> considerable overhead in keeping the C++ client in its current
> >>> position.
> >>>
> >>> ### Issues with the current approach
> >>>
> >>> The Pulsar repository has grown a lot in size and number of active
> >>> developers.
> >>>
> >>> 1. The frequency of changes in various parts of the codebase has
> >>>    increased to a point where the amount of resources dedicated to
> >>>    CI is very significant.
> >>>
> >>>    Every change in Java code will trigger the CI jobs for the C++
> >>>    client and every change in the C++ client will do the same.
> >>>
> >>>    During a CI job we are building the C++ client multiple times:
> >>>    1. For C++ and Python client tests
> >>>    2. To build Python wheels to be included in the pulsar Docker
> >>>       images (for supporting Pulsar functions)
> >>>
> >>> 2. The release process for Pulsar has become very complex and
> >>>    requires building a large number of binaries for C++ and Python
> >>>    clients. This has become too much of a burden during the course
> >>>    of a Pulsar release.
> >>>
> >>>
> >>> ## Goal
> >>>
> >>> Decouple the development of C++ and Python client libraries from the
> >>> development of the core components of Pulsar in Java.
> >>>
> >>>
> >>> ## Changes
> >>>
> >>> ### Repositories
> >>>
> >>> 1. Move the C++ client code to a new repository
> >>>    `github.com/apache/pulsar-client-c++`
> >>> 2. Move the Python client code to a new repository
> >>>    `github.com/apache/pulsar-client-python`
> >>>
> >>> The change will be done without losing any history, extracting a
> >>> sub-directory into a new Git repository.
> >>>
> >>> ```
> >>> git filter-repo --subdirectory-filter pulsar-client-cpp
> >>> ```
> >>>
> >>> ### Release process
> >>>
> >>> The release process will be split in multiple parts:
> >>>
> >>> 1. the main Pulsar release will only contain the Java parts (server
> >>>    distribution and Java client library)
> >>> 2. The C++ client will have its own release schedule and versioning
> >>> 3. The Python client will have its own release schedule and versioning
> >>>
> >>> #### Versioning
> >>>
> >>> Both C++ and Python clients will continue with their own individual
> >>> versioning.
> >>>
> >>> In order not to break anything or cause more confusion, we would
> >>> need to use a new version that is higher than the current version
> >>> (2.11.x).
> >>>
> >>> The suggestion is to start the new releases for both C++ and Python
> >>> from 3.0.0.
> >>>
> >>>
> >>> #### Existing branches
> >>>
> >>> In existing branches of Pulsar, the C++ client will remain in the
> >>> main repository and will keep receiving bug fixes in its current
> >>> location.
> >>>
> >>> The different location of the new C++ code will make the
> >>> cherry-picking process slightly more painful in the short term,
> >>> though it will even out in the long term.
> >>>
> >>>
> >>> ### Project dependencies
> >>>
> >>> #### C++/Python --> Pulsar
> >>>
> >>> Both C++ and Python unit/integration tests are designed to run
> >>> against a standalone instance of the Pulsar broker. In the current
> >>> form, they're using the `master` code that is built to run the
> >>> tests.
> >>>
> >>> After the split, the unit tests will use a Docker image of Pulsar.
> >>> We can use a few different images to test for compatibility:
> >>> 1. Latest stable (e.g. 2.10.1)
> >>> 2. Nightly (Pulsar Docker image published every day from the master
> >>>    branch)
> >>>
> >>> #### Pulsar --> Python
> >>>
> >>> To create a Pulsar image, we are now building the Python client
> >>> wheel file and then installing it at build time.
> >>>
> >>> Instead, we are going to include a wheel file for a version of the
> >>> Python client that has already been released.
> >>>
> >>> #### Python --> C++
> >>>
> >>> The Python client library is just a wrapper on top of the C++
> >>> client. Today these are built together, with the Python wrapper code
> >>> residing in a sub-directory of the C++ client code, and compiled
> >>> using the same CMake build script.
> >>>
> >>> By separating the Python client into a different repository, we are
> >>> going to depend on an already released version of the C++ client.
> >>>
> >>>
> >>> #### Automated documentation on the website
> >>>
> >>> On the Pulsar website we are auto-generating the C++ documentation
> >>> with the Doxygen tool and the Python one with Pdoc.
> >>>
> >>> Instead of just fetching the main repo code, the website build job
> >>> should also fetch the new repos to run the tooling.
> >>>
> >>>
> >>> --
> >>> Matteo Merli
> >>> <mme...@apache.org>
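
A note on the versioning rule in the quoted proposal: the first standalone C++ and Python client releases must sort above the current 2.11.x line, which is why 3.0.0 is suggested. The sketch below only illustrates that ordering check. The function names and the sample version "2.11.0" are hypothetical, not part of the proposal or of any Pulsar tooling, and it assumes plain MAJOR.MINOR.PATCH strings with no suffixes.

```python
def parse_version(text):
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def is_safe_first_release(candidate, last_bundled):
    """Return True if `candidate` sorts strictly above `last_bundled`."""
    return parse_version(candidate) > parse_version(last_bundled)


if __name__ == "__main__":
    # "2.11.0" is a hypothetical stand-in for the last 2.11.x release.
    print(is_safe_first_release("3.0.0", "2.11.0"))

    # Comparing the raw strings would mis-order 2.9.x and 2.11.x,
    # because "1" sorts before "9" character by character.
    print("2.11.0" > "2.9.0")
    print(parse_version("2.11.0") > parse_version("2.9.0"))
```

Comparing integer tuples rather than raw strings matters here because the existing version line already has a two-digit minor number.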