The Pulsar website is published by a CI job that regenerates the HTML
files and commits them to a separate branch ('asf-site') of the Pulsar
repo. From there, the site is immediately visible on the web.

One issue with this process is that the frequent updates to generated
HTML files keep growing the size of the Pulsar Git repo. Every clone
by a developer or user has to fetch the entire repo.

This is made somewhat worse by daily updates that touch many HTML
files only to refresh timestamps. I just merged a fix for that:
https://github.com/apache/pulsar/pull/12538

A clone of the Git repo is already 1.4 GB. 90% of that size is due to
the 'asf-site' branch.

Ideally, we should move the website deployment to a dedicated repo,
outside the main Pulsar repo.

In the meantime, I propose to truncate the history of the 'asf-site'
branch by squashing all of its commits into a single one, in order to
reduce the repo size.
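
For illustration, here is a minimal sketch of one way the squash could
be done, not necessarily how it will be done. It creates an orphan
branch from the current tip of 'asf-site', commits the tree once, and
force-pushes it over the old branch. The remote name 'origin', the
temporary branch name, the commit message, and the helper names
squash_commands and run are all assumptions made for this example.

```python
import subprocess
from typing import List


def squash_commands(branch: str = "asf-site",
                    remote: str = "origin",
                    message: str = "Squash website history") -> List[List[str]]:
    """Return the git commands that replace `branch` with a single commit."""
    tmp = f"{branch}-squashed"  # hypothetical temporary branch name
    return [
        # Make sure the remote-tracking ref is up to date.
        ["git", "fetch", remote, branch],
        # New parentless branch whose index holds the current site content.
        ["git", "checkout", "--orphan", tmp, f"{remote}/{branch}"],
        # Record that content as the one and only commit.
        ["git", "commit", "-m", message],
        # Rename the temporary branch over the local branch name.
        ["git", "branch", "-M", tmp, branch],
        # Overwrite the remote branch; this rewrites published history.
        ["git", "push", "--force", remote, branch],
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(squash_commands())  # dry run: only prints the commands
```

The force push is the destructive step, so the sketch defaults to a
dry run. Anyone with an existing checkout of 'asf-site' would need to
reset it to the new tip afterwards.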

Let me know what you think.

Matteo

--
Matteo Merli
<mme...@apache.org>
