+1, it makes sense to me.

Thanks,
Jiaqi Shen


Yu <li...@apache.org> wrote on Mon, Dec 19, 2022 at 20:57:

> Hi tison,
>
> Thanks for raising this up!
>
> Our community had a similar discussion previously and chose to "keep" the
> docs in the Pulsar main repo at that time.
>
> [1] lists the pros and cons of "keep" and "not keep" solutions.
>
> I'm +0 on this proposal because I think the total scores of these two
> solutions are almost equal after weighing the pros and cons.
>
> ~~~~~~~~~~~~~~~~~~~~
>
> [1] https://lists.apache.org/thread/mf2xwntfgn84dq78ksqv22jk3drq6xb3
>
>
> On Mon, Dec 19, 2022 at 5:40 PM tison <wander4...@gmail.com> wrote:
>
> > Thanks for your feedback!
> >
> > @Asaf
> >
> > > pre-commit
> >
> > I mean CI checks before merging a patch. Currently, we don't run checks on
> > the content before merging it. This causes a series of syntax errors and
> > broken links. If we keep the docs under the site2 folder in the main repo
> > and copy them to the site repo, we have two places to build such CI
> > checks. What's worse, the checks for the main repo will be quite
> > cumbersome (you would need some if-else logic in the whole Pulsar CI
> > workflow, and would have to do the sync sequentially in that workflow).
> >
> > If we hold the source of docs only in the site repo, we can extend the
> > "precommit" workflow[1] I added recently to also check for syntax errors
> > and broken links.
> >
> > > What does the apache/pulsar-site repo contain today?
> >
> > It should be covered by the documentation guide page[2]. It holds the
> > source of the official website and the user docs are synced from the main
> > repo.
> >
> > > What content do we have today in the pulsar repo related to the site?
> >
> > After issue-18014[3] is done, we host only user docs and some JSON metadata
> > in the main repo, which are synced by site_syncer.py[4].
> >
> > > Can you explain that better? Are you saying pulsar source JARs contain
> > the documentation?
> >
> > No. Source JARs contain only the Java files and the necessary copyright
> > info. The source release is, for example,
> > https://archive.apache.org/dist/pulsar/pulsar-2.10.2/apache-pulsar-2.10.2-src.tar.gz,
> > which is extracted to 173M, of which 129M is occupied by the site2 folder.
> >
> > This also affects developers when they git clone the repo.
> >
> > > I mean, if you wish to document a bug fix in 2.9.x, for example, would
> > you do it in the 2.9.x branch under site2/docs or
> > site2/website/versioned_docs/2.9.5?
> >
> > This is another question. Ideally, we would host the docs for each
> > version on that version's branch, as Apache Flink does[5]. But we do
> > not; in practice, we update the versioned docs on the master branch so
> > that the docs can be synced properly.
> >
> > See also the "Alternatives" section in the original email.
> >
> > @All
> >
> > Since we don't have objections to the possible cons listed above or any
> > new ones, I'm going to create a tracking issue later this week and show
> > what will be changed in PRs for further review.
> >
> > Best,
> > tison.
> >
> > [1] https://github.com/apache/pulsar-site/blob/f7abc615d57d9846ed093922d24bff952dc0e838/.github/workflows/ci-precommit.yml
> > [2] https://pulsar.apache.org/contribute/document-contribution/#source-repositories
> > [3] https://github.com/apache/pulsar/issues/18014
> > [4] https://github.com/apache/pulsar-site/blob/f7abc615d57d9846ed093922d24bff952dc0e838/tools/pytools/lib/execute/site_syncer.py
> > [5] https://github.com/apache/flink/tree/master/docs
> >
> >
> > PengHui Li <peng...@apache.org> wrote on Mon, Dec 19, 2022 at 16:26:
> >
> > > +1
> > >
> > > I support moving them to the website repo.
> > >
> > > Thanks,
> > > Penghui
> > >
> > > On Mon, Dec 19, 2022 at 12:04 PM Yunze Xu <y...@streamnative.io.invalid> wrote:
> > >
> > > > +1. The most significant point to me is that we can preview all the
> > > > content of the website without synchronizing contents from the
> > > > apache/pulsar repo.
> > > >
> > > > Thanks,
> > > > Yunze
> > > >
> > > > On Mon, Dec 19, 2022 at 9:53 AM Li Li <urf...@apache.org> wrote:
> > > > >
> > > > > > +1, that’s a good idea.
> > > > >
> > > > > > On Dec 16, 2022, at 07:07, tison <wander4...@gmail.com> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > After several rounds of work on the build flow of our official
> > > > > > > website[1][2][3], the content sync and site build flow are now
> > > > > > > debuggable and reproducible.
> > > > > >
> > > > > > > However, compared with the website layouts and workflows of other
> > > > > > > Apache projects, we still face two challenges on the Pulsar site:
> > > > > > >
> > > > > > > 1. We don't have a pre-commit workflow for any website-related
> > > > > > > changes. Thus, we don't detect broken links or syntax errors when
> > > > > > > reviewing new patches[4][5][6].
> > > > > > > 2. The website's content sits two levels down in
> > > > > > > `site2/website-next` for historical reasons, which is confusing
> > > > > > > for contributors.
> > > > > >
> > > > > > To overcome these two shortcomings, I propose the following:
> > > > > >
> > > > > > > 1. Move the website's content to the root level, so that we have a
> > > > > > > first-class Docusaurus- and yarn-based JS project layout. It's
> > > > > > > more convenient and familiar to the relevant developers.
> > > > > > > 2. Host the source of docs in the site repo (apache/pulsar-site)
> > > > > > > instead of keeping it under the `site2` folder in the main repo
> > > > > > > and syncing the content.
> > > > > >
> > > > > > Below are the pros and cons:
> > > > > >
> > > > > > Pros
> > > > > >
> > > > > > > 1. Obviously, we get the pre-commit workflow. And since we host
> > > > > > > the source of docs in one repo, we don't have to run the
> > > > > > > pre-commit workflow in two places, which can be quite cumbersome
> > > > > > > to implement.
> > > > > > > 2. The size of the source release of the main repo can be reduced
> > > > > > > a lot. Currently, 63MB out of 140MB of the sources are taken by
> > > > > > > the site2 folder, which we can remove entirely. In addition, we
> > > > > > > currently carry the full versioned docs in every release.
> > > > > > > 3. We can clean up a large portion of the brittle "integration"
> > > > > > > used to debug the site from the main repo[7] (etc.) and the
> > > > > > > redundant contribution guide[8]. This way, when updating docs, we
> > > > > > > can preview the result in one repo instead of actually doing the
> > > > > > > sync on the fly. In addition, this integration blocks us from
> > > > > > > moving the website content to the top level, since it makes
> > > > > > > strong assumptions about the relative layout.
> > > > > >
> > > > > > Cons
> > > > > >
> > > > > > > The most significant con is that we can no longer update the code
> > > > > > > and docs in one patch against apache/pulsar. You must open a
> > > > > > > separate pull request to apache/pulsar-site, cross-reference the
> > > > > > > two, and manage the merge order (synchronization).
> > > > > >
> > > > > > Alternatives:
> > > > > >
> > > > > > > To resolve the versioned docs issue, an alternative is to host
> > > > > > > only the user docs along with each version, as Flink does[9]. But
> > > > > > > that both departs from the Docusaurus framework and requires
> > > > > > > significant development effort.
> > > > > >
> > > > > > > Since this explicitly changes the development flow (that is, you
> > > > > > > would now update docs separately), I am starting this discussion
> > > > > > > here to ask for your feedback.
> > > > > >
> > > > > > > Comments are welcome!
> > > > > >
> > > > > > Best,
> > > > > > tison.
> > > > > >
> > > > > > [1] https://pulsar.apache.org/
> > > > > > [2] https://github.com/apache/pulsar-site
> > > > > > [3] https://github.com/apache/pulsar/issues/18014
> > > > > > [4] https://github.com/apache/pulsar/issues/17599
> > > > > > [5]
> > > https://github.com/apache/pulsar/pull/17863#discussion_r990174850
> > > > > > [6]
> > > https://github.com/apache/pulsar/pull/17853#discussion_r991803704
> > > > > > [7]
> > > > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/b1f9e351fa4d5aba197d33cfc0c536516b55b61f/site2/website/start.sh
> > > > > > [8]
> > > > > >
> > > >
> > >
> >
> https://pulsar.apache.org/contribute/document-preview/#preview-documentation-changes
> > > > > > [9] https://github.com/apache/flink/tree/master/docs
> > > > >
> > > >
> > >
> >
>
