> We don't have a pre-commit workflow for any website-related changes.
> Thus, we don't detect broken links or syntax errors when reviewing new
> patches[4][5][6].

When you say "pre-commit workflow," what exactly do you mean? Usually, a pre-commit check is something you run locally upon committing. Is that what you mean? And when you say "new patches," do you mean content, code, or documentation?

> 1. Move the website's content to the root level, then we have a first-class
> Docu&yarn-based JS project layout. It's more convenient and familiar to
> related developers.
> 2. Host the source of docs in the site repo (apache/pulsar-site) instead of
> under `site2` folder in the main repo and do content sync.

Can you give some context? What does the apache/pulsar-site repo contain today? What site-related content do we have today in the pulsar repo? I see we have the docs and also the JavaScript code. I would love some context here.

> 2. The size of the source release of the main repo can be reduced a lot.
> Currently, 63MB out of 140MB of the sources are taken by the site2 folder,
> which we can remove totally. In addition, we carry out full-versioned docs
> every release.

Can you explain that better? Are you saying the Pulsar source JARs contain the documentation?

> Cons
>
> The most significant con is that we cannot update the code and docs in one
> patch against apache/pulsar now. You must open a new pull request to
> apache/pulsar-site, cross-reference each other and manage the merge order
> (synchronization).

We can take, let's say, five features and check whether their code and docs were actually updated in the same PR or in separate PRs. My guess is that most documentation is actually updated separately. Thus, from that perspective, maybe it's not a con.

Since we actually maintain separate directories per version, at least for the documentation, we lose the actual benefit of saying the code is in sync with the docs. I mean, if you wish to document a bug fix in 2.9.x, for example, would you do it in the 2.9.x branch under site2/docs or under site2/website/versioned_docs/2.9.5?

The site2/docs and site2/website/versioned_docs are super confusing, IMO.
Why is the first called docs while the other actually sits under the website? I would prefer everything related to docs to be kept together, with sub-folders like "previous_versions" and "next," rather than separated as it is today.

Thanks,

Asaf

On 16 Dec 2022 at 1:07:47, tison <wander4...@gmail.com> wrote:

> Hi,
>
> After several works around the build flow of our official website[1][2][3],
> the content sync and site build flow is debuggable and reproducible now.
>
> However, compared to other Apache projects' websites' project layouts and
> workflow, we still meet two challenges on the Pulsar site:
>
> 1. We don't have a pre-commit workflow for any website-related changes.
> Thus, we don't detect broken links or syntax errors when reviewing new
> patches[4][5][6].
> 2. The website's content is two-level down in `site2/website-next` for
> historical reasons, which is confusing for contributors.
>
> To overcome these two shortcomings, I propose the following:
>
> 1. Move the website's content to the root level, then we have a first-class
> Docu&yarn-based JS project layout. It's more convenient and familiar to
> related developers.
> 2. Host the source of docs in the site repo (apache/pulsar-site) instead of
> under `site2` folder in the main repo and do content sync.
>
> Below are the pros and cons:
>
> Pros
>
> 1. Obviously, we have the pre-commit workflow now. And since we host the
> source of docs in one repo, we don't have to run the pre-commit workflow in
> two places, which can be quite cumbersome to implement.
> 2. The size of the source release of the main repo can be reduced a lot.
> Currently, 63MB out of 140MB of the sources are taken by the site2 folder,
> which we can remove totally. In addition, we carry out full-versioned docs
> every release.
> 3. We can clean up a large portion of "integration" to debug the site
> brittlely on the main repo[7] (etc.) and redundant contribution guide[8].
> This way, when updating docs, we can preview the result in one repo instead
> of actually doing the sync on the fly. In addition, this integration blocks
> we move the website content to the top level since it makes strong
> assumptions about the relative layout.
>
> Cons
>
> The most significant con is that we cannot update the code and docs in one
> patch against apache/pulsar now. You must open a new pull request to
> apache/pulsar-site, cross-reference each other and manage the merge order
> (synchronization).
>
> Alternatives:
>
> To resolve the versioned docs issue, an alternative is to host only the
> user docs along with each version, like Flink does[9]. But it both detaches
> from the Docu framework and requires significant development efforts.
>
> Since it can explicitly change the development flow (that is, you should
> now update docs separately), I am starting this discussion here to reach
> for your feedback.
>
> Welcome to leave your comments!
>
> Best,
> tison.
>
> [1] https://pulsar.apache.org/
> [2] https://github.com/apache/pulsar-site
> [3] https://github.com/apache/pulsar/issues/18014
> [4] https://github.com/apache/pulsar/issues/17599
> [5] https://github.com/apache/pulsar/pull/17863#discussion_r990174850
> [6] https://github.com/apache/pulsar/pull/17853#discussion_r991803704
> [7] https://github.com/apache/pulsar/blob/b1f9e351fa4d5aba197d33cfc0c536516b55b61f/site2/website/start.sh
> [8] https://pulsar.apache.org/contribute/document-preview/#preview-documentation-changes
> [9] https://github.com/apache/flink/tree/master/docs
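
To make the "detect broken links before merging" part of the discussion concrete, here is a minimal sketch of the kind of check such a workflow could run over a docs folder. It is only an illustration in Python, not the workflow proposed in the thread and not something that exists in apache/pulsar or apache/pulsar-site as far as this thread shows. The function name `find_broken_links` is hypothetical, and the sketch assumes the docs are plain `.md` files that use inline Markdown links with relative paths.

```python
import re
from pathlib import Path

# Matches inline Markdown links such as [text](target) or [text](target "title").
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)[^)]*\)")

# Targets this sketch does not try to resolve: external URLs, mail links,
# same-page anchors, and site-absolute paths (which depend on site config).
SKIPPED_PREFIXES = ("http://", "https://", "mailto:", "#", "/")


def find_broken_links(docs_root):
    """Return (file, target) pairs for relative links whose target file is missing."""
    root = Path(docs_root)
    broken = []
    for md_file in sorted(root.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            if target.startswith(SKIPPED_PREFIXES):
                continue
            # Drop a trailing "#section" anchor before checking the file itself.
            path_part = target.split("#", 1)[0]
            if not (md_file.parent / path_part).exists():
                broken.append((str(md_file.relative_to(root)), target))
    return broken
```

A CI job or local hook could call `find_broken_links("docs")` and fail when the returned list is non-empty. Whether the check lives in the main repo or the site repo is exactly the question the proposal raises: with docs hosted in one repo, a check like this only has to run in one place.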