How big of a change would it be to have the repo only contain the Markdown source and not the rendered HTML (which should perhaps be moved to an object store)?
> On Aug 8, 2024, at 8:06 AM, Kent Yao <y...@apache.org> wrote:
>
> Hi dev,
>
> The current size of the spark-website repository is approximately 16GB,
> exceeding the storage limit of GitHub-hosted runners. The GitHub actions
> have been failing recently in the actions/checkout step caused by
> 'No space left on device' errors.
>
> Filesystem      Size  Used Avail Use% Mounted on
> overlay          73G   58G   16G  80% /
> tmpfs            64M     0   64M   0% /dev
> tmpfs           7.9G     0  7.9G   0% /sys/fs/cgroup
> shm              64M     0   64M   0% /dev/shm
> /dev/root        73G   58G   16G  80% /__w
> tmpfs           1.6G  1.2M  1.6G   1% /run/docker.sock
> tmpfs           7.9G     0  7.9G   0% /proc/acpi
> tmpfs           7.9G     0  7.9G   0% /proc/scsi
> tmpfs           7.9G     0  7.9G   0% /sys/firmware
>
> The documentation for each version contributes the most volume. Since version
> 3.5.0, the documentation size has grown 3-4 times larger than the
> size of 3.4.x, with more than 1GB.
>
> 9.9M  ./0.6.0
> 10M   ./0.6.1
> 10M   ./0.6.2
> 15M   ./0.7.0
> 16M   ./0.7.2
> 16M   ./0.7.3
> 20M   ./0.8.0
> 20M   ./0.8.1
> 38M   ./0.9.0
> 38M   ./0.9.1
> 38M   ./0.9.2
> 36M   ./1.0.0
> 38M   ./1.0.1
> 38M   ./1.0.2
> 48M   ./1.1.0
> 48M   ./1.1.1
> 73M   ./1.2.0
> 73M   ./1.2.1
> 74M   ./1.2.2
> 69M   ./1.3.0
> 73M   ./1.3.1
> 68M   ./1.4.0
> 70M   ./1.4.1
> 80M   ./1.5.0
> 78M   ./1.5.1
> 78M   ./1.5.2
> 87M   ./1.6.0
> 87M   ./1.6.1
> 87M   ./1.6.2
> 86M   ./1.6.3
> 117M  ./2.0.0
> 119M  ./2.0.0-preview
> 118M  ./2.0.1
> 118M  ./2.0.2
> 121M  ./2.1.0
> 121M  ./2.1.1
> 122M  ./2.1.2
> 122M  ./2.1.3
> 130M  ./2.2.0
> 131M  ./2.2.1
> 132M  ./2.2.2
> 131M  ./2.2.3
> 141M  ./2.3.0
> 141M  ./2.3.1
> 141M  ./2.3.2
> 142M  ./2.3.3
> 142M  ./2.3.4
> 145M  ./2.4.0
> 146M  ./2.4.1
> 145M  ./2.4.2
> 144M  ./2.4.3
> 145M  ./2.4.4
> 143M  ./2.4.5
> 143M  ./2.4.6
> 143M  ./2.4.7
> 143M  ./2.4.8
> 197M  ./3.0.0
> 185M  ./3.0.0-preview
> 197M  ./3.0.0-preview2
> 198M  ./3.0.1
> 198M  ./3.0.2
> 205M  ./3.0.3
> 239M  ./3.1.1
> 239M  ./3.1.2
> 239M  ./3.1.3
> 840M  ./3.2.0
> 842M  ./3.2.1
> 282M  ./3.2.2
> 244M  ./3.2.3
> 282M  ./3.2.4
> 295M  ./3.3.0
> 297M  ./3.3.1
> 297M  ./3.3.2
> 297M  ./3.3.3
> 297M  ./3.3.4
> 314M  ./3.4.0
> 314M  ./3.4.1
> 328M  ./3.4.2
> 324M  ./3.4.3
> 1.1G  ./3.5.0
> 1.2G  ./3.5.1
> 1.1G  ./4.0.0-preview1
>
> I'm concerned about publishing the documentation for version 3.5.2
> to the asf-site. So, I have merged PR[2] to eliminate this potential blocker.
>
> Considering that the problem still exists, should we temporarily archive
> some of the outdated version documents? For example, only keep
> the latest version for each feature release in the asf-site branch. Or,
> Do you have any other suggestions?
>
> Bests,
> Kent Yao
>
> [1] https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories
> [2] https://github.com/apache/spark-website/pull/543
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
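
For reference, the "only keep the latest version for each feature release" idea from the quoted message could be sketched roughly as below. This is only an illustration, not an existing script in spark-website: the function names (`parse_version`, `split_keep_archive`) are hypothetical, and it assumes that directory names follow the `x.y.z` or `x.y.z-suffix` pattern seen in the listing and that a final release should win over a preview of the same `x.y.z`.

```python
import re
from collections import defaultdict

# Matches names like "3.5.1" or "4.0.0-preview1" (assumed naming pattern).
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-(.+))?$")


def parse_version(name):
    """Return a sortable key for a version directory name, or None."""
    m = VERSION_RE.match(name)
    if m is None:
        return None
    major, minor, patch, suffix = m.groups()
    # suffix is None for a final release; True sorts after False, so a
    # final release outranks any preview of the same x.y.z.
    return (int(major), int(minor), int(patch), suffix is None, suffix or "")


def split_keep_archive(dirs):
    """Split directory names into (keep, archive).

    keep holds the newest entry of each feature release (x.y);
    archive holds every other recognised version directory.
    """
    groups = defaultdict(list)
    for d in dirs:
        key = parse_version(d)
        if key is not None:  # ignore anything that is not a version directory
            groups[key[:2]].append((key, d))
    keep = [d for _, d in sorted(max(g) for g in groups.values())]
    kept = set(keep)
    everything = sorted(item for g in groups.values() for item in g)
    archive = [d for _, d in everything if d not in kept]
    return keep, archive


if __name__ == "__main__":
    sample = ["3.4.0", "3.4.1", "3.4.2", "3.4.3",
              "3.5.0", "3.5.1", "4.0.0-preview1"]
    keep, archive = split_keep_archive(sample)
    print("keep:   ", keep)
    print("archive:", archive)
```

Where the archived directories would actually go (a separate branch, an object store, or somewhere else) is left open here, since that is the question under discussion.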