Is it possible to come up with a compromise that keeps the pros of both sides while minimizing the side effects? I'm thinking it may not be necessary to sacrifice the current contribution process, as long as we can greatly reduce the load on back-end actions and the source size. For example, if we move only the versioned docs to the site repo but keep the source of the NEXT docs in the pulsar repo, would that win a large proportion of those pros while people can still contribute as usual?
________________________________
From: Jiaqi Shen <gleiphir2...@gmail.com>
Sent: Tuesday, December 20, 2022 17:15
To: dev@pulsar.apache.org <dev@pulsar.apache.org>
Subject: Re: [PROPOSAL] Website precommit and move the source of docs to the site repo

+1, it makes sense to me.

Thanks,
Jiaqi Shen

Yu <li...@apache.org> wrote on Mon, Dec 19, 2022 at 20:57:

> Hi tison,
>
> Thanks for raising this!
>
> Our community had a similar discussion previously and chose to "keep" the
> docs in the Pulsar main repo at that time.
>
> [1] lists the pros and cons of the "keep" and "not keep" solutions.
>
> I'm +0 on this proposal because I think the total scores of these two
> solutions are almost equal after weighing the pros and cons.
>
> ~~~~~~~~~~~~~~~~~~~~
>
> [1] https://lists.apache.org/thread/mf2xwntfgn84dq78ksqv22jk3drq6xb3
>
>
> On Mon, Dec 19, 2022 at 5:40 PM tison <wander4...@gmail.com> wrote:
>
> > Thanks for your feedback!
> >
> > @Asaf
> >
> > > pre-commit
> >
> > I mean CI checks before merging a patch. Currently, we don't run checks on
> > the content before merging it. This causes a series of syntax errors and
> > broken-link issues. If we hold docs under the site2 folder in the main repo
> > and then copy them to the site repo, we have two places to build such CI
> > checks. What's worse, the checks for the main repo will be quite
> > cumbersome (you have to do some if-else logic in the whole Pulsar CI
> > workflows, and do the sync sequentially in that workflow).
> >
> > If we hold the source of docs only in the site repo, we can extend the
> > "precommit" workflow[1] I added recently to also check for syntax errors
> > and broken links.
> >
> > > What does the apache/pulsar-site repo contain today?
> >
> > It should be covered by the documentation guide page[2]. It holds the
> > source of the official website, and the user docs are synced from the main
> > repo.
> >
> > > What content do we have today in the pulsar repo related to the site?
> >
> > After issue-18014[3] is done, we host only user docs and some JSON metadata
> > in the main repo, which is synced by site_syncer.py[4].
> >
> > > Can you explain that better? Are you saying pulsar source JARs contain
> > > the documentation?
> >
> > No. Source JARs contain only the Java files and necessary copyrights info.
> > The source release is, for example,
> > https://archive.apache.org/dist/pulsar/pulsar-2.10.2/apache-pulsar-2.10.2-src.tar.gz,
> > which is extracted to 173M where 129M is occupied by the site2 folder.
> >
> > This also affects when developers do git clone to clone the repo.
> >
> > > I mean, if you wish to document a bug fix in 2.9.x, for example, would
> > > you do it in the 2.9.x branch under site2/docs or
> > > site2/website/versioned_docs/2.9.5?
> >
> > This is another question. Ideally, we should have hosted versioned docs
> > associated with the specific version to that branch, like Apache Flink does
> > as I mentioned[5]. But we do not, and actually the situation is we update
> > the versioned docs under the master branch and thus, the docs can be synced
> > properly.
> >
> > See also the "Alternatives" section in the original email.
> >
> > @All
> >
> > Since we don't have objections to the possible cons listed above or any new
> > ones, I'm going to create a tracking issue later this week and show what
> > will be changed in PRs for further review.
> >
> > Best,
> > tison.
> >
> > [1] https://github.com/apache/pulsar-site/blob/f7abc615d57d9846ed093922d24bff952dc0e838/.github/workflows/ci-precommit.yml
> > [2] https://pulsar.apache.org/contribute/document-contribution/#source-repositories
> > [3] https://github.com/apache/pulsar/issues/18014
> > [4] https://github.com/apache/pulsar-site/blob/f7abc615d57d9846ed093922d24bff952dc0e838/tools/pytools/lib/execute/site_syncer.py
> > [5] https://github.com/apache/flink/tree/master/docs
> >
> >
> > PengHui Li <peng...@apache.org> wrote on Mon, Dec 19, 2022 at 16:26:
> >
> > > +1
> > >
> > > I support moving them to the website repo.
> > >
> > > Thanks,
> > > Penghui
> > >
> > > On Mon, Dec 19, 2022 at 12:04 PM Yunze Xu <y...@streamnative.io.invalid>
> > > wrote:
> > >
> > > > +1. The most significant point to me is that we can preview all the
> > > > content of the website without synchronizing contents from the
> > > > apache/pulsar repo.
> > > >
> > > > Thanks,
> > > > Yunze
> > > >
> > > > On Mon, Dec 19, 2022 at 9:53 AM Li Li <urf...@apache.org> wrote:
> > > > >
> > > > > +1, That’s a good idea.
> > > > >
> > > > > > On Dec 16, 2022, at 07:07, tison <wander4...@gmail.com> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > After several works around the build flow of our official website[1][2][3],
> > > > > > the content sync and site build flow is debuggable and reproducible now.
> > > > > >
> > > > > > However, compared to other Apache projects' websites' project layouts and
> > > > > > workflow, we still meet two challenges on the Pulsar site:
> > > > > >
> > > > > > 1. We don't have a pre-commit workflow for any website-related changes.
> > > > > > Thus, we don't detect broken links or syntax errors when reviewing new
> > > > > > patches[4][5][6].
> > > > > > 2. The website's content is two-level down in `site2/website-next` for
> > > > > > historical reasons, which is confusing for contributors.
> > > > > >
> > > > > > To overcome these two shortcomings, I propose the following:
> > > > > >
> > > > > > 1. Move the website's content to the root level, then we have a first-class
> > > > > > Docu&yarn-based JS project layout. It's more convenient and familiar to
> > > > > > related developers.
> > > > > > 2. Host the source of docs in the site repo (apache/pulsar-site) instead of
> > > > > > under `site2` folder in the main repo and do content sync.
> > > > > >
> > > > > > Below are the pros and cons:
> > > > > >
> > > > > > Pros
> > > > > >
> > > > > > 1. Obviously, we have the pre-commit workflow now. And since we host the
> > > > > > source of docs in one repo, we don't have to run the pre-commit workflow in
> > > > > > two places, which can be quite cumbersome to implement.
> > > > > > 2. The size of the source release of the main repo can be reduced a lot.
> > > > > > Currently, 63MB out of 140MB of the sources are taken by the site2 folder,
> > > > > > which we can remove totally. In addition, we carry out full-versioned docs
> > > > > > every release.
> > > > > > 3. We can clean up a large portion of "integration" to debug the site
> > > > > > brittlely on the main repo[7] (etc.) and redundant contribution guide[8].
> > > > > > This way, when updating docs, we can preview the result in one repo instead
> > > > > > of actually doing the sync on the fly. In addition, this integration blocks
> > > > > > we move the website content to the top level since it makes strong
> > > > > > assumptions about the relative layout.
> > > > > >
> > > > > > Cons
> > > > > >
> > > > > > The most significant con is that we cannot update the code and docs in one
> > > > > > patch against apache/pulsar now.
> > > > > > You must open a new pull request to
> > > > > > apache/pulsar-site, cross-reference each other and manage the merge order
> > > > > > (synchronization).
> > > > > >
> > > > > > Alternatives:
> > > > > >
> > > > > > To resolve the versioned docs issue, an alternative is to host only the
> > > > > > user docs along with each version, like Flink does[9]. But it both detaches
> > > > > > from the Docu framework and requires significant development efforts.
> > > > > >
> > > > > > Since it can explicitly change the development flow (that is, you should
> > > > > > now update docs separately), I am starting this discussion here to reach
> > > > > > for your feedback.
> > > > > >
> > > > > > Welcome to leave your comments!
> > > > > >
> > > > > > Best,
> > > > > > tison.
> > > > > >
> > > > > > [1] https://pulsar.apache.org/
> > > > > > [2] https://github.com/apache/pulsar-site
> > > > > > [3] https://github.com/apache/pulsar/issues/18014
> > > > > > [4] https://github.com/apache/pulsar/issues/17599
> > > > > > [5] https://github.com/apache/pulsar/pull/17863#discussion_r990174850
> > > > > > [6] https://github.com/apache/pulsar/pull/17853#discussion_r991803704
> > > > > > [7] https://github.com/apache/pulsar/blob/b1f9e351fa4d5aba197d33cfc0c536516b55b61f/site2/website/start.sh
> > > > > > [8] https://pulsar.apache.org/contribute/document-preview/#preview-documentation-changes
> > > > > > [9] https://github.com/apache/flink/tree/master/docs