Hey everyone,

Thanks, Bowrna :)

@Jarek Dream team indeed.

@Amogh Sure, no problem, we can sync on Slack when you are back.

As Ryan mentioned, we need to be able to deal with two types of docs:
1. Older docs: published more than 18 months ago.
2. Newer docs: published within the last 18 months.

So, we can work on two fronts in parallel:

1. Older Docs

> Once that's complete, we can implement a process to archive the raw .rst
> files for docs older than 18 months to S3 along with a way to download and
> build those in the airflow-site repo.
>       1. This will result in temporarily having two separate builds:
>          1. One for archived docs like we do now
>          2. And the build process for new docs developed in (1)
>       2. After 18 months, all of the archived docs will be out of the repo,
>       and we can move forward with only the build process developed in (1)
>
Just one thought: maybe we can enforce the archiving process via pre-commit
hooks or by updating the `breeze release-management publish-docs` command, so
that any time something new is published we also check for docs to archive.
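
To make the 18-month check a bit more concrete, here is a rough Python sketch
of what such a hook or command could run. The function name, the
version-to-date mapping, and approximating 18 months as 548 days are all my
assumptions for illustration; nothing like this exists in breeze today as far
as I know.

```python
from datetime import date, timedelta

# Assumption: "18 months" is approximated as 548 days (18 * 365.25 / 12).
ARCHIVE_AFTER = timedelta(days=548)


def versions_to_archive(release_dates: dict[str, date], today: date) -> list[str]:
    """Return the versions released more than ARCHIVE_AFTER before `today`."""
    cutoff = today - ARCHIVE_AFTER
    return sorted(v for v, released in release_dates.items() if released < cutoff)


if __name__ == "__main__":
    # Hypothetical versions and release dates, for illustration only.
    releases = {
        "example-1.0.0": date(2021, 10, 11),
        "example-2.0.0": date(2023, 8, 18),
    }
    print(versions_to_archive(releases, today=date(2023, 10, 30)))
```

The real check would need the release dates from somewhere (git tags, PyPI, or
a manifest), which is a separate decision.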

2. Newer Docs

> 1. I think we should start with enabling Hugo in the documentation build
>    process for new releases
>       1. This may need to include a way to serve html from S3, as I think
>       we'll need to build each version for each package (apache-airflow &
>       providers). If we do this each time, the amount of docs built will
>       grow exponentially and we might find ourselves again in a similar
>       situation
>       2. Once this is done, all new docs will be buildable without storing
>       the raw html locally
>       3. I think a good example (at least for a lot of this process) is how
>       the Apache Iceberg docs repo
>       <https://github.com/apache/iceberg-docs/tree/main> is built.
>

I think there is one downside to this approach. If we move newer docs out to
S3 and someone changes the theme's CSS, we have to regenerate the HTML for
Airflow and all providers, across all their versions, and upload it to S3
again. TBH, I'm not sure how frequently these changes occur. WDYT?

Current Tasks:
1. Create a script to extract the main content from Sphinx-generated HTML
into Hugo's landing-pages/site/content/en/docs/ dir.
2. If you look at Airflow's website, the sidebar and breadcrumbs are dynamic
components of a page and they change from provider to provider. This means
we have to replicate the logic for those components in Hugo's templating
syntax.
3. Alter the publishing logic in the GitHub Actions workflow.
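
For task 1, this is roughly the direction I have in mind, as a minimal sketch
using only the Python standard library. It assumes the Sphinx theme wraps the
page body in an element with role="main" and that the generated HTML is
well-formed; both are assumptions I still need to verify against our theme,
and the class and function names are just placeholders.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class MainContentExtractor(HTMLParser):
    """Collect the inner HTML of the first element with role="main"."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=False)
        self.depth = 0      # nesting depth inside the main element; 0 = outside
        self.done = False   # True once the main element has been closed
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if self.depth == 0:
            # Assumption: role="main" marks the page body in the Sphinx theme.
            if not self.done and dict(attrs).get("role") == "main":
                self.depth = 1
            return
        self.parts.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied without changing depth.
        if self.depth:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth == 0 or tag in VOID_TAGS:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True  # closing tag of the main element itself
        else:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth:
            self.parts.append(f"&#{name};")


def extract_main(html: str) -> str:
    """Return the inner HTML of the main content block, or "" if none is found."""
    parser = MainContentExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

The actual script would walk the build output, call something like
`extract_main` on each page, and write the result into the Hugo content
directory. HTML comments are dropped by this sketch, and unclosed tags would
throw off the depth counting, so a real HTML library may turn out to be the
safer choice.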

Currently, I have started on task 1, writing a script to fetch the main
content from Sphinx-generated HTML. If anyone wants to pick something up,
they can start with task 2. I'll be pushing a PR to airflow-site
<https://github.com/apache/airflow-site.git> with Hugo configured to serve
static content for the sidebar and breadcrumbs. After that, we need to
replicate the logic for the sidebar and breadcrumbs in Hugo's templating
syntax and make it dynamic.
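
As an illustration of the kind of logic task 2 has to reproduce, here is a
small Python sketch that derives a breadcrumb trail from a page path. It is
not the Hugo template itself, and it assumes URLs shaped like
/docs/<package>/<version>/<page> with the path segment used as the label; the
real theme may use page titles instead.

```python
from pathlib import PurePosixPath


def breadcrumbs(doc_path: str) -> list[tuple[str, str]]:
    """Return (label, url) pairs for every ancestor of doc_path and the page itself.

    Assumption: the label is simply the path segment, and every ancestor
    URL ends with a trailing slash.
    """
    parts = PurePosixPath(doc_path.strip("/")).parts
    trail = []
    for i, part in enumerate(parts, start=1):
        url = "/" + "/".join(parts[:i])
        is_leaf = i == len(parts)
        trail.append((part, url if is_leaf else url + "/"))
    return trail
```

The package and version segments are what change from provider to provider,
so the Hugo version would have to derive the same trail from the page's
position in the content tree.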

One downside of replicating the sidebar and breadcrumb logic is that we
are violating the DRY principle, but I was hoping that we eventually make
Hugo the primary way to render the website and can deprecate the
`sphinx_airflow_theme` altogether. What do you all think?

Thanks,
Utkarsh Sharma

On Fri, Oct 27, 2023 at 4:40 PM Amogh Desai <amoghdesai....@gmail.com>
wrote:

> Yeah, excellent team!
>
> Utkarsh note that I will be on vacation from today till Nov 6. I should be
> able to help after that :)
>
> Even during this period i will have slack on mobile, so I can help
> asynchronously if needed.
>
>
> Thanks & Regards,
> Amogh Desai
>
> On Fri, Oct 27, 2023, 14:22 Jarek Potiuk <ja...@potiuk.com> wrote:
>
> > Whoa. Dream team :) .
> >
> > And of course - if you need any of my input of how it works or get
> > stuck with something - feel absolutely free to ping me on slack. While I
> > have not developed the build process I probably tinkered and touched it
> in
> > the past in many places and reverse engineered some parts of it so I
> might
> > save you some of the head-scratching.
> >
> > On Fri, Oct 27, 2023 at 6:35 AM Bowrna Prabhakaran <mailbow...@gmail.com
> >
> > wrote:
> >
> > > > I would also like to join in these efforts.
> > >
> > >
> > > On Fri, Oct 27, 2023 at 8:19 AM Ryan Hatter
> > > <ryan.hat...@astronomer.io.invalid> wrote:
> > >
> > > > I'm happy to work on this alongside Utkarsh, Amogh Desai, and Aritra
> > Basu
> > > > :)
> > > > Some thoughts on Utkarsh's proposal (and what him and I have been
> > > > discussing offline):
> > > >
> > > >    1. I think we should start with enabling Hugo in the documentation
> > > build
> > > >    process for new releases
> > > >       1. This may need to include a way to serve html from S3, as I
> > think
> > > >       we'll need to build each version for each package
> > (apache-airflow &
> > > >       providers). If we do this each time, the amount of docs built
> > will
> > > > grow
> > > >       exponentially and we might find ourselves again in a similar
> > > > situation
> > > >       2. Once this is done, all new docs will be buildable without
> > > storing
> > > >       the raw html locally
> > > >       3. I think a good example (at least for a lot of this process)
> is
> > > how
> > > >       the Apache Iceberg docs repo
> > > >       <https://github.com/apache/iceberg-docs/tree/main> is built.
> > > >    2. Once that's complete, we can implement a process to archive the
> > raw
> > > >    .rst files for docs older than 18 months to S3 along with a way to
> > > > download
> > > >    and build those in the airflow-site repo.
> > > >       1. This will result in temporarily having two separate builds:
> > > >          1. One for archived docs like we do now
> > > >          2. And the build process for new docs developed in (1)
> > > >       2. After 18 months, all of the archived docs will be out of the
> > > repo,
> > > >       and we can move forward with only the build process developed
> in
> > > (1)
> > > >
> > > >
> > > > On Fri, Oct 27, 2023 at 7:55 AM utkarsh sharma <
> utkarshar...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > That sounds good, I'll start with creating smaller tickets for the
> > > above
> > > > > task, which I intend to do by the end of this week.
> > > > >
> > > > > Thanks,
> > > > > Utkarsh Sharma
> > > > >
> > > > >
> > > > > On Thu, Oct 26, 2023 at 4:16 PM Aritra Basu <
> > aritrabasu1...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Yup, sounds good to me let's go for it!
> > > > > >
> > > > > > --
> > > > > > Regards,
> > > > > > Aritra Basu
> > > > > >
> > > > > > On Thu, Oct 26, 2023, 1:47 PM Amogh Desai <
> > amoghdesai....@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Go ahead Utkarsh. It would be nice to work with you along this.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Amogh Desai
> > > > > > >
> > > > > > > On Wed, Oct 25, 2023 at 10:02 PM Jarek Potiuk <
> ja...@potiuk.com>
> > > > > wrote:
> > > > > > >
> > > > > > > > +1. I think no-one will object to improve the current
> situation
> > > :)
> > > > > > > >
> > > > > > > > On Wed, Oct 25, 2023 at 5:02 PM utkarsh sharma <
> > > > > utkarshar...@gmail.com
> > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hey everyone,
> > > > > > > > >
> > > > > > > > > If we have a consensus on the suggestions in my previous
> > > email, I
> > > > > > would
> > > > > > > > > like to subdivide the task into smaller tickets and
> > distribute
> > > > them
> > > > > > > among
> > > > > > > > > Aritra Basu, Amogh Desai, and myself.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Utkarsh Sharma
> > > > > > > > >
> > > > > > > > > On Tue, Oct 24, 2023 at 10:12 PM Jarek Potiuk <
> > > ja...@potiuk.com>
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Those look like great ideas.
> > > > > > > > > >
> > > > > > > > > > On Tue, Oct 24, 2023 at 4:23 PM utkarsh sharma <
> > > > > > > utkarshar...@gmail.com
> > > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Just forgot to mention in my previous mail, that I'm
> > > > suggesting
> > > > > > the
> > > > > > > > > above
> > > > > > > > > > > changes since the storage is not the primary concern
> > right
> > > > now
> > > > > > but
> > > > > > > > I'm
> > > > > > > > > > > happy to contribute either way. :)
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Oct 24, 2023 at 7:43 PM utkarsh sharma <
> > > > > > > > utkarshar...@gmail.com
> > > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hey everyone,
> > > > > > > > > > > >
> > > > > > > > > > > > I have a couple of tasks in mind, that might aid in
> > > > reducing
> > > > > > the
> > > > > > > > > > efforts
> > > > > > > > > > > > while working with docs. Right now tasks listed below
> > are
> > > > > > > difficult
> > > > > > > > > to
> > > > > > > > > > > > achieve.
> > > > > > > > > > > >
> > > > > > > > > > > > 1. Adding a warning based on a specific
> > provider/version
> > > > of a
> > > > > > > > > > > > provider/range of providers. Which was also the task
> > that
> > > > > Ryan
> > > > > > > was
> > > > > > > > > > > working
> > > > > > > > > > > > on.
> > > > > > > > > > > > 2. Altering a page layout or CSS for a specific
> > provider.
> > > > > > > > > > > >
> > > > > > > > > > > > The issue while trying to achieve the above tasks is
> > > > because
> > > > > of
> > > > > > > the
> > > > > > > > > > > > pre-prepared static files we get as a final product
> of
> > > > > building
> > > > > > > > > > documents
> > > > > > > > > > > > with *breeze build-docs* in folder docs/_build. The
> > files
> > > > we
> > > > > > get
> > > > > > > > are
> > > > > > > > > > > > self-sufficient to be hosted and they are really just
> > > used
> > > > > > > directly
> > > > > > > > > > > leaving
> > > > > > > > > > > > no room for customization of any sort.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > My proposal would be to break down this process as
> > > follows:
> > > > > > > > > > > >
> > > > > > > > > > > > 1. We can prepare partial documents as part of
> *breeze
> > > > > > > build-docs*
> > > > > > > > > > which
> > > > > > > > > > > > are only responsible for providing HTML to be
> populated
> > > > > within
> > > > > > > the
> > > > > > > > > Body
> > > > > > > > > > > tag
> > > > > > > > > > > > for a specific provider, and not the layout of the
> > entire
> > > > > page.
> > > > > > > > > > > > 2. We then copy partial static files to the
> > Airflow-site
> > > > repo
> > > > > > > > within
> > > > > > > > > > > > landing pages/site/layouts/docs. Where the layout of
> > the
> > > > page
> > > > > > > will
> > > > > > > > be
> > > > > > > > > > > > provided by `single.html`, a listing of all the
> > providers
> > > > > will
> > > > > > be
> > > > > > > > > > > provided
> > > > > > > > > > > > by `list.html`, which are standard hugo
> > > > > > > > > > > > <https://gohugo.io/about/what-is-hugo/> features.
> > Also,
> > > > > using
> > > > > > > > static
> > > > > > > > > > > > files from `sphinx_airflow_theme` which lives in the
> > same
> > > > > repo,
> > > > > > > > makes
> > > > > > > > > > the
> > > > > > > > > > > > changes on the CSS easy.
> > > > > > > > > > > > 3. We can then use Hugo to generate static
> > > > > > > > > > > > <
> > > > > > https://gohugo.io/getting-started/quick-start/#publish-the-site
> > > > > > > >
> > > > > > > > > > files
> > > > > > > > > > > > and push them to the `gh-pages` branch to publish
> them
> > > > using
> > > > > > > GitHub
> > > > > > > > > > > pages.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Doing the above changes will enable us to do the
> > > following:
> > > > > > > > > > > >
> > > > > > > > > > > > 1. Will give us more control to work on a specific
> > > > > > > > > > > > provider/provider-version if we want by providing
> > > > templates -
> > > > > > > > > > > > https://gohugo.io/templates/lookup-order/
> > > > > > > > > > > > 2. We will have a specific code to look at depending
> on
> > > the
> > > > > > > changes
> > > > > > > > > one
> > > > > > > > > > > > intends to make, right now if you don't know the flow
> > > it's
> > > > a
> > > > > > bit
> > > > > > > > > > > difficult
> > > > > > > > > > > > to pinpoint the code to change.
> > > > > > > > > > > > 1. If we want to make changes to a specific
> provider's
> > > > > content
> > > > > > we
> > > > > > > > can
> > > > > > > > > > do
> > > > > > > > > > > > it Airflow's repo docs/<provider>/*.rst file.
> > > > > > > > > > > > 2. If we have a change that affects multiple
> providers
> > or
> > > > > > > versions
> > > > > > > > we
> > > > > > > > > > can
> > > > > > > > > > > > do it on Airflow Website's repo.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Utkarsh Sharma
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Oct 24, 2023 at 3:45 PM Jarek Potiuk <
> > > > > ja...@potiuk.com
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >> So it looks like we have some helping hands and we
> > need
> > > > > > someone
> > > > > > > to
> > > > > > > > > > lead
> > > > > > > > > > > it
> > > > > > > > > > > >> :) (just saying).
> > > > > > > > > > > >>
> > > > > > > > > > > >> On Tue, Oct 24, 2023 at 8:15 AM Amogh Desai <
> > > > > > > > > amoghdesai....@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > >> wrote:
> > > > > > > > > > > >>
> > > > > > > > > > > >> > +1 (non binding) from me on the thought of moving
> > the
> > > > > older
> > > > > > > docs
> > > > > > > > > > (~18
> > > > > > > > > > > >> > months seems ok) to an archive instead of the
> > > > repository.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > Coming to the other problem of copying the built
> > docs
> > > > into
> > > > > > > > > > > airflow-site
> > > > > > > > > > > >> for
> > > > > > > > > > > >> > releases, maybe we can fix that using a script?
> Open
> > > for
> > > > > > > > thoughts
> > > > > > > > > > > here.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > I would be very happy to help when we start taking
> > > this
> > > > > > > > forward, I
> > > > > > > > > > > have
> > > > > > > > > > > >> > some experience in airflow-site and docs side as
> > well.
> > > > > Feel
> > > > > > > free
> > > > > > > > > to
> > > > > > > > > > > >> reach
> > > > > > > > > > > >> > out over email or slack :)
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > Thanks & Regards,
> > > > > > > > > > > >> > Amogh Desai
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > On Mon, Oct 23, 2023 at 3:08 AM Aritra Basu <
> > > > > > > > > > aritrabasu1...@gmail.com
> > > > > > > > > > > >
> > > > > > > > > > > >> > wrote:
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > > This definitely sounds like something that needs
> > > doing
> > > > > > > sooner
> > > > > > > > > > rather
> > > > > > > > > > > >> than
> > > > > > > > > > > >> > > later.
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > While I'd love to help, I'm not too experienced
> > with
> > > > > this
> > > > > > > area
> > > > > > > > > so
> > > > > > > > > > I
> > > > > > > > > > > >> might
> > > > > > > > > > > >> > > not be able to actually propose what changes
> need
> > > > doing,
> > > > > > but
> > > > > > > > if
> > > > > > > > > > > >> someone
> > > > > > > > > > > >> > has
> > > > > > > > > > > >> > > a path forward on this I can definitely
> contribute
> > > > some
> > > > > > time
> > > > > > > > to
> > > > > > > > > > help
> > > > > > > > > > > >> out
> > > > > > > > > > > >> > > given some guidance on what is needed.
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > --
> > > > > > > > > > > >> > > Regards,
> > > > > > > > > > > >> > > Aritra Basu
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > On Mon, Oct 23, 2023, 2:19 AM Jarek Potiuk <
> > > > > > > ja...@potiuk.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > > Some news here.
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > I caught up with some infra changes that
> > happened
> > > > > while
> > > > > > I
> > > > > > > > was
> > > > > > > > > > > >> > travelling
> > > > > > > > > > > >> > > -
> > > > > > > > > > > >> > > > and I have just (with
> > > > > > > > > > > >> https://github.com/apache/airflow-site/pull/879)
> > > > > > > > > > > >> > > > switched the "airflow-site" building to the
> new,
> > > > > > > self-hosted
> > > > > > > > > > > >> > > "asf-runners".
> > > > > > > > > > > >> > > > This is a new option that ASF infra has given
> to
> > > > test
> > > > > > for
> > > > > > > > the
> > > > > > > > > > ASF
> > > > > > > > > > > >> > > projects
> > > > > > > > > > > >> > > > - rather than relying on "public runners", we
> > can
> > > > > switch
> > > > > > > to
> > > > > > > > > > > >> self-hosted
> > > > > > > > > > > >> > > > runners donated by Microsoft to the ASF. More
> > info
> > > > > here:
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> >
> > > > > > > > > > > >>
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?spaceKey=INFRA&title=ASF+Infra+provided+self-hosted+runners
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > The most important result is that we now have
> a
> > > lot
> > > > > more
> > > > > > > > > > > "breathing
> > > > > > > > > > > >> > > space"
> > > > > > > > > > > >> > > > for the docs building job. During the build we
> > are
> > > > > using
> > > > > > > max
> > > > > > > > > 59%
> > > > > > > > > > > of
> > > > > > > > > > > >> the
> > > > > > > > > > > >> > > > disk space - with 73GB used and 52GB free.
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > >  Filesystem      Size  Used Avail Use% Mounted
> > on
> > > > > > > > > > > >> > > >   overlay         124G   73G   52G  59% /
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > This is - on one hand - good news (disk space
> is
> > > not
> > > > > an
> > > > > > > > > "acute"
> > > > > > > > > > > >> issue
> > > > > > > > > > > >> > any
> > > > > > > > > > > >> > > > more), I think if someone would like to work
> on
> > > > > > improving
> > > > > > > > the
> > > > > > > > > > docs
> > > > > > > > > > > >> > > building
> > > > > > > > > > > >> > > > of ours, they have much more breathing space
> to
> > do
> > > > so.
> > > > > > > > > > > >> > > > But - clearly - it might mean that the
> incentive
> > > to
> > > > > work
> > > > > > > on
> > > > > > > > it
> > > > > > > > > > > >> > decreased
> > > > > > > > > > > >> > > -
> > > > > > > > > > > >> > > > because it "just works"). That's the bad
> effect
> > of
> > > > it.
> > > > > > > And I
> > > > > > > > > > think
> > > > > > > > > > > >> it's
> > > > > > > > > > > >> > > not
> > > > > > > > > > > >> > > > good, though the most I can do is to reiterate
> > > > Ryan's
> > > > > > > > concerns
> > > > > > > > > > and
> > > > > > > > > > > >> hope
> > > > > > > > > > > >> > > we
> > > > > > > > > > > >> > > > will get someone committing to improving this.
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > I would strongly encourage those who want to
> > > improve
> > > > > it,
> > > > > > > to
> > > > > > > > do
> > > > > > > > > > > so. I
> > > > > > > > > > > >> > > think
> > > > > > > > > > > >> > > > - as Ryan stated - contributing to our docs is
> > > more
> > > > > > > complex
> > > > > > > > > than
> > > > > > > > > > > it
> > > > > > > > > > > >> > > should
> > > > > > > > > > > >> > > > be and anyone who would like to contribute
> there
> > > is
> > > > > most
> > > > > > > > > > welcome.
> > > > > > > > > > > I
> > > > > > > > > > > >> > very
> > > > > > > > > > > >> > > > much share all the points that Ryan made and I
> > > think
> > > > > we
> > > > > > > > should
> > > > > > > > > > > >> welcome
> > > > > > > > > > > >> > > any
> > > > > > > > > > > >> > > > efforts to make it better. The lack of
> > > > > > > > incremental/auto-build
> > > > > > > > > > > >> support
> > > > > > > > > > > >> > is
> > > > > > > > > > > >> > > > especially troublesome for anyone who wants to
> > > > > > contribute
> > > > > > > > > their
> > > > > > > > > > > >> docs.
> > > > > > > > > > > >> > > Happy
> > > > > > > > > > > >> > > > to help anyone who would like to take on the
> > task.
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > Still - if we would like to move old docs
> > outside
> > > > as a
> > > > > > > first
> > > > > > > > > > step,
> > > > > > > > > > > >> I am
> > > > > > > > > > > >> > > > happy to help anyone who would like to commit
> to
> > > > doing
> > > > > > it.
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > J.
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > On Fri, Oct 20, 2023 at 3:27 PM Pierre
> Jeambrun
> > <
> > > > > > > > > > > >> pierrejb...@gmail.com
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > > wrote:
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > > +1 from moving archived docs outside of
> > > > > airflow-site.
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > > Even if that might mean a little more
> > > maintenance
> > > > in
> > > > > > > case
> > > > > > > > we
> > > > > > > > > > > need
> > > > > > > > > > > >> to
> > > > > > > > > > > >> > > > > propagate changes to all historical
> versions,
> > we
> > > > > would
> > > > > > > > have
> > > > > > > > > to
> > > > > > > > > > > >> > handle 2
> > > > > > > > > > > >> > > > > repositories, but that seems like a minor
> > > downside
> > > > > > > > compared
> > > > > > > > > to
> > > > > > > > > > > the
> > > > > > > > > > > >> > > > quality
> > > > > > > > > > > >> > > > > of life improvement that it would bring for
> > > > > > airflow-site
> > > > > > > > > > > >> > contributions.
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > > Le jeu. 19 oct. 2023 à 16:11, Jarek Potiuk <
> > > > > > > > > ja...@potiuk.com>
> > > > > > > > > > a
> > > > > > > > > > > >> > écrit
> > > > > > > > > > > >> > > :
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > > > Let me just clarify (because that could be
> > > > > unclear)
> > > > > > > what
> > > > > > > > > my
> > > > > > > > > > +1
> > > > > > > > > > > >> was
> > > > > > > > > > > >> > > > about.
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > I was not talking (and I believe Ryan was
> > not
> > > > > > talking
> > > > > > > > > > either)
> > > > > > > > > > > >> about
> > > > > > > > > > > >> > > > > > removing the old docs but about archiving
> > them
> > > > and
> > > > > > > > serving
> > > > > > > > > > > from
> > > > > > > > > > > >> > > > elsewhere
> > > > > > > > > > > >> > > > > > (cloud storage).
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > I think discussing changing to more shared
> > > > > > HTML/JS/CSS
> > > > > > > > is
> > > > > > > > > > > also a
> > > > > > > > > > > >> > good
> > > > > > > > > > > >> > > > > idea
> > > > > > > > > > > >> > > > > > to optimise it, but possibly can be
> handled
> > > > > > separately
> > > > > > > > as
> > > > > > > > > a
> > > > > > > > > > > >> longer
> > > > > > > > > > > >> > > > effort
> > > > > > > > > > > >> > > > > > of redesigning how the docs are built. But
> > by
> > > > all
> > > > > > > means
> > > > > > > > we
> > > > > > > > > > > could
> > > > > > > > > > > >> > also
> > > > > > > > > > > >> > > > > work
> > > > > > > > > > > >> > > > > > on that.
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > Maybe I jumped to conclusions, but the
> > > easiest,
> > > > > > > tactical
> > > > > > > > > > > >> solution
> > > > > > > > > > > >> > > (for
> > > > > > > > > > > >> > > > > the
> > > > > > > > > > > >> > > > > > most acute issue - size) is we just move
> the
> > > old
> > > > > > > > generated
> > > > > > > > > > > HTML
> > > > > > > > > > > >> > docs
> > > > > > > > > > > >> > > > from
> > > > > > > > > > > >> > > > > > the git repository of "airflow-site" and
> in
> > > the
> > > > > > > > > > "github_pages"
> > > > > > > > > > > >> > branch
> > > > > > > > > > > >> > > > we
> > > > > > > > > > > >> > > > > > replace it with redirecting of those pages
> > to
> > > > the
> > > > > > > files
> > > > > > > > > > served
> > > > > > > > > > > >> from
> > > > > > > > > > > >> > > the
> > > > > > > > > > > >> > > > > > cloud storage (and I believe this is what
> > Ryan
> > > > > > hinted
> > > > > > > > at).
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > Those redirects could be automatically
> > > generated
> > > > > for
> > > > > > > all
> > > > > > > > > > > >> > > > > > historical versions and they will be
> small.
> > > We
> > > > > are
> > > > > > > > > already
> > > > > > > > > > > >> doing
> > > > > > > > > > > >> > it
> > > > > > > > > > > >> > > > for
> > > > > > > > > > > >> > > > > > individual pages for navigating between
> > > > versions,
> > > > > > but
> > > > > > > we
> > > > > > > > > > could
> > > > > > > > > > > >> > easily
> > > > > > > > > > > > >> > > > > > replace all the historical docs with
> > > > > > > > > > > > >> > > > > > "<html><head><meta http-equiv="refresh" content="0;
> > > > > > > > > > > > >> > > > > > url=https://new-archive-docs-airflow-url/airflow/version/document.url"/></head></html>".
> > > > > > > > > > > > >> > > > > > Low-tech, surely and
> > > > "legacy",
> > > > > > but
> > > > > > > > it
> > > > > > > > > > will
> > > > > > > > > > > >> > solve
> > > > > > > > > > > >> > > > the
> > > > > > > > > > > >> > > > > > size problem instantly. We currently have
> > > > 115.148
> > > > > > such
> > > > > > > > > files
> > > > > > > > > > > >> which
> > > > > > > > > > > >> > > will
> > > > > > > > > > > >> > > > > go
> > > > > > > > > > > >> > > > > > down to about ~20 MB of files which is
> > > peanuts,
> > > > > > > compared
> > > > > > > > > to
> > > > > > > > > > > the
> > > > > > > > > > > >> > > current
> > > > > > > > > > > >> > > > > > 17GB (!) we have.
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > We can also inject into the moved
> "storage"
> > > > docs,
> > > > > > the
> > > > > > > > > header
> > > > > > > > > > > >> that
> > > > > > > > > > > >> > > > informs
> > > > > > > > > > > >> > > > > > that this is an old/archived documentation
> > > with
> > > > > > single
> > > > > > > > > > > redirect
> > > > > > > > > > > >> to
> > > > > > > > > > > >> > > > > > "live"/"stable" site for newer versions of
> > > docs
> > > > > > > (which I
> > > > > > > > > > > believe
> > > > > > > > > > > >> > > > sparked
> > > > > > > > > > > >> > > > > > Ryan's work). This can be done at least as
> > the
> > > > > > "quick"
> > > > > > > > > > > >> remediation
> > > > > > > > > > > >> > > for
> > > > > > > > > > > >> > > > > the
> > > > > > > > > > > >> > > > > > size issue and something that might allow
> > the
> > > > > > current
> > > > > > > > > scheme
> > > > > > > > > > > to
> > > > > > > > > > > >> > > > > > work without ever-growing repo/size and
> > using
> > > > > space
> > > > > > > for
> > > > > > > > > the
> > > > > > > > > > > >> build
> > > > > > > > > > > >> > > > action.
> > > > > > > > > > > >> > > > > > If we have such an automated mechanism in
> > > place,
> > > > > we
> > > > > > > > could
> > > > > > > > > > > >> > > periodically
> > > > > > > > > > > >> > > > > > archive old docs. All that without
> changing
> > > the
> > > > > > build
> > > > > > > > > > process
> > > > > > > > > > > of
> > > > > > > > > > > >> > ours
> > > > > > > > > > > >> > > > and
> > > > > > > > > > > >> > > > > > simply keep old "past" docs elsewhere
> (still
> > > > > > > accessible
> > > > > > > > > for
> > > > > > > > > > > >> users).
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > Not much should change for the users IMHO
> -
> > if
> > > > > they
> > > > > > go
> > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > > >> old
> > > > > > > > > > > >> > > > version
> > > > > > > > > > > >> > > > > > of the docs or use old, archived URLs,
> they
> > > > would
> > > > > > end
> > > > > > > up
> > > > > > > > > > > seeing
> > > > > > > > > > > >> the
> > > > > > > > > > > >> > > > > > same content/navigation they see today
> (with
> > > > extra
> > > > > > > > > > information
> > > > > > > > > > > >> it's
> > > > > > > > > > > >> > > an
> > > > > > > > > > > >> > > > > old
> > > > > > > > > > > >> > > > > > version and served from a different URL).
> > > > > > > > > > > >> > > > > > When they go to the "old" version of
> > > > documentation
> > > > > > > they
> > > > > > > > > > could
> > > > > > > > > > > be
> > > > > > > > > > > >> > > > > redirected
> > > > > > > > > > > >> > > > > > to the new one - same HTML but hosted on
> > cloud
> > > > > > > storage,
> > > > > > > > > > fully
> > > > > > > > > > > >> > > > statically.
> > > > > > > > > > > >> > > > > > We already do that with "redirect"
> > mechanism.
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > In the meantime, someone could also work
> on
> > a
> > > > > > > strategic
> > > > > > > > > > > >> solution -
> > > > > > > > > > > >> > > and
> > > > > > > > > > > >> > > > > > changing the current build process, but
> this
> > > is
> > > > -
> > > > > I
> > > > > > > > think
> > > > > > > > > a
> > > > > > > > > > > >> > > different -
> > > > > > > > > > > >> > > > > > and much more complex and requiring a lot
> of
> > > > > effort
> > > > > > -
> > > > > > > > > step.
> > > > > > > > > > > And
> > > > > > > > > > > >> it
> > > > > > > > > > > >> > > > could
> > > > > > > > > > > >> > > > > > simply end up with regenerating whatever
> is
> > > left
> > > > > as
> > > > > > > > "live"
> > > > > > > > > > > >> > > > documentation
> > > > > > > > > > > >> > > > > > (leaving the archive docs intact).
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > That's at least what I see as a possible
> set
> > > of
> > > > > > steps
> > > > > > > to
> > > > > > > > > > take.
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > J.
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > On Thu, Oct 19, 2023 at 2:14 PM utkarsh
> > > sharma <
> > > > > > > > > > > >> > > utkarshar...@gmail.com
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > > > wrote:
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > > Hey everyone,
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > Thanks, Ryan for starting the thread :)
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > >> > > > > > > Big +1 For archiving docs older than 18
> > > > months.
> > > > > We
> > > > > > > can
> > > > > > > > > > still
> > > > > > > > > > > >> make
> > > > > > > > > > > >> > > the
> > > > > > > > > > > >> > > > > > older
> > > > > > > > > > > >> > > > > > > docs available in `rst` doc form.
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > >> > > > > > > But eventually, we might again run into this problem because of the
> > > > > > > > > > > >> > > > > > > growing number of providers. I think the main reason for this issue is
> > > > > > > > > > > >> > > > > > > the generated static HTML pages and the way we serve them using GitHub
> > > > > > > > > > > >> > > > > > > Pages. The generated pages have lots of common code, HTML
> > > > > > > > > > > >> > > > > > > (headers/navigation/breadcrumbs/footer etc.), CSS, and JS, which is
> > > > > > > > > > > >> > > > > > > repeated for every provider and every version of that provider. If we
> > > > > > > > > > > >> > > > > > > had a more dynamic way (Django/Flask servers) of serving the documents,
> > > > > > > > > > > >> > > > > > > we could save all the space taken by the common HTML/CSS/JS.
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > >> > > > > > > But the downsides of this approach are:
> > > > > > > > > > > >> > > > > > > 1. We need to have a server
> > > > > > > > > > > >> > > > > > > 2. It also requires changes in the existing document build process to
> > > > > > > > > > > >> > > > > > > only produce partial HTML documents.
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > >> > > > > > > Thanks,
> > > > > > > > > > > >> > > > > > > Utkarsh Sharma
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > >> > > > > > > On Thu, Oct 19, 2023 at 4:08 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > >> > > > > > > > Yes. Moving the old version to somewhere that we can keep/archive
> > > > > > > > > > > >> > > > > > > > static historical versions of those historical docs and publish them
> > > > > > > > > > > >> > > > > > > > from there. What you proposed is exactly the solution I thought might
> > > > > > > > > > > >> > > > > > > > be best as well.
> > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > >> > > > > > > > It would be a great task to contribute to the stability of our docs
> > > > > > > > > > > >> > > > > > > > generation in the future.
> > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > >> > > > > > > > I don't think it's a matter of discussing in detail how to do it (18
> > > > > > > > > > > >> > > > > > > > months is a good start and you can parameterize it). It's a matter of
> > > > > > > > > > > >> > > > > > > > someone committing to it and simply doing it :).
> > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > >> > > > > > > > So yes, I personally am all for it, and if I understand correctly that
> > > > > > > > > > > >> > > > > > > > you are looking for agreement on doing it, big +1 from my side - happy
> > > > > > > > > > > >> > > > > > > > to help with providing access to our S3 buckets.
> > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > >> > > > > > > > J.
> > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > >> > > > > > > > On Thu, Oct 19, 2023 at 5:39 AM Ryan Hatter <ryan.hat...@astronomer.io.invalid> wrote:
> > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > >> > > > > > > > > *tl;dr*
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > >    1. The GitHub Action for building docs is running out of space. I
> > > > > > > > > > > >> > > > > > > > >    think we should archive really old documentation for large packages
> > > > > > > > > > > >> > > > > > > > >    to cloud storage.
> > > > > > > > > > > >> > > > > > > > >    2. Contributing to and building Airflow docs is hard. We should
> > > > > > > > > > > >> > > > > > > > >    migrate to a framework, preferably one that uses markdown (although
> > > > > > > > > > > >> > > > > > > > >    I acknowledge rst -> md will be a massive overhaul).
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > *Problem Summary*
> > > > > > > > > > > >> > > > > > > > > I recently set out to implement what I thought would be a
> > > > > > > > > > > >> > > > > > > > > straightforward feature: warn users when they are viewing
> > > > > > > > > > > >> > > > > > > > > documentation for non-current versions of Airflow and link them to
> > > > > > > > > > > >> > > > > > > > > the current/stable version
> > > > > > > > > > > >> > > > > > > > > <https://github.com/apache/airflow/pull/34639>. Jed pointed me to the
> > > > > > > > > > > >> > > > > > > > > airflow-site <https://github.com/apache/airflow-site> repo, which
> > > > > > > > > > > >> > > > > > > > > contains all of the archived docs (that is, documentation for
> > > > > > > > > > > >> > > > > > > > > non-current versions), and from there, I ran into a brick wall.
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > I want to raise some concerns that I've developed after trying to
> > > > > > > > > > > >> > > > > > > > > contribute what feel like a couple of reasonably small docs updates:
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > >    1. airflow-site
> > > > > > > > > > > >> > > > > > > > >       1. Elad pointed out the problem posed by the sheer size of
> > > > > > > > > > > >> > > > > > > > >       archived docs
> > > > > > > > > > > >> > > > > > > > >       <https://apache-airflow.slack.com/archives/CCPRP7943/p1697009000242369?thread_ts=1696973512.004229&cid=CCPRP7943>
> > > > > > > > > > > >> > > > > > > > >       (more on this later).
> > > > > > > > > > > >> > > > > > > > >       2. The airflow-site repo is confusing, and rather poorly
> > > > > > > > > > > >> > > > > > > > >       documented.
> > > > > > > > > > > >> > > > > > > > >          1. Hugo (static site generator) exists, but appears to only
> > > > > > > > > > > >> > > > > > > > >          be used for the landing pages
> > > > > > > > > > > >> > > > > > > > >          2. In order to view any documentation locally other than the
> > > > > > > > > > > >> > > > > > > > >          landing pages, you'll need to run the site.sh script then
> > > > > > > > > > > >> > > > > > > > >          copy the output from one dir to another?
> > > > > > > > > > > >> > > > > > > > >       3. All of the archived docs are raw HTML, making migrating to a
> > > > > > > > > > > >> > > > > > > > >       static site generator a significant challenge, which makes it
> > > > > > > > > > > >> > > > > > > > >       difficult to prevent the archived docs from continuing to grow
> > > > > > > > > > > >> > > > > > > > >       and grow. Perhaps this is the wheel Khaleesi was referring to
> > > > > > > > > > > >> > > > > > > > >       <https://www.youtube.com/watch?v=J-rxmk6zPxA>?
> > > > > > > > > > > >> > > > > > > > >    2. airflow
> > > > > > > > > > > >> > > > > > > > >       1. Building Airflow docs is a challenge. It takes several
> > > > > > > > > > > >> > > > > > > > >       minutes and doesn't support auto-build, so the slightest issue
> > > > > > > > > > > >> > > > > > > > >       could require waiting again and again until the changes are
> > > > > > > > > > > >> > > > > > > > >       just so. I tried implementing sphinx-autobuild
> > > > > > > > > > > >> > > > > > > > >       <https://github.com/executablebooks/sphinx-autobuild>
> > > > > > > > > > > >> > > > > > > > >       to no avail.
> > > > > > > > > > > >> > > > > > > > >       2. Sphinx/reStructuredText has a steep learning curve.
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > *The most acute issue: disk space*
> > > > > > > > > > > >> > > > > > > > > The size of the archived docs is causing the docs build GitHub Action
> > > > > > > > > > > >> > > > > > > > > to almost run out of space. From the "Build site" Action from a
> > > > > > > > > > > >> > > > > > > > > couple weeks ago
> > > > > > > > > > > >> > > > > > > > > <https://github.com/apache/airflow-site/actions/runs/6419529645/job/17432628458>
> > > > > > > > > > > >> > > > > > > > > (expand the build site step, scroll all the way to the bottom, expand
> > > > > > > > > > > >> > > > > > > > > the `df -h` command), we can see the GitHub Action runner (or
> > > > > > > > > > > >> > > > > > > > > whatever it's called) is nearly running out of space:
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > df -h
> > > > > > > > > > > >> > > > > > > > >   *Filesystem      Size  Used Avail Use% Mounted on*
> > > > > > > > > > > >> > > > > > > > >   /dev/root        84G   82G  2.1G  98% /
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > The available space is down to 1.8G on the most recent Action
> > > > > > > > > > > >> > > > > > > > > <https://github.com/apache/airflow-site/actions/runs/6564727255/job/17831714176>.
> > > > > > > > > > > >> > > > > > > > > If we assume that trend is accurate, we have about two months before
> > > > > > > > > > > >> > > > > > > > > the Action runner runs out of disk space. Here's a breakdown of the
> > > > > > > > > > > >> > > > > > > > > space consumed by the 10 largest package documentation directories:
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > du -h -d 1 docs-archive/ | sort -h -r
> > > > > > > > > > > >> > > > > > > > > * 14G* docs-archive/
> > > > > > > > > > > >> > > > > > > > > *4.0G* docs-archive//apache-airflow-providers-google
> > > > > > > > > > > >> > > > > > > > > *3.2G* docs-archive//apache-airflow
> > > > > > > > > > > >> > > > > > > > > *1.7G* docs-archive//apache-airflow-providers-amazon
> > > > > > > > > > > >> > > > > > > > > *560M* docs-archive//apache-airflow-providers-microsoft-azure
> > > > > > > > > > > >> > > > > > > > > *254M* docs-archive//apache-airflow-providers-cncf-kubernetes
> > > > > > > > > > > >> > > > > > > > > *192M* docs-archive//apache-airflow-providers-apache-hive
> > > > > > > > > > > >> > > > > > > > > *153M* docs-archive//apache-airflow-providers-snowflake
> > > > > > > > > > > >> > > > > > > > > *139M* docs-archive//apache-airflow-providers-databricks
> > > > > > > > > > > >> > > > > > > > > *104M* docs-archive//apache-airflow-providers-docker
> > > > > > > > > > > >> > > > > > > > > *101M* docs-archive//apache-airflow-providers-mysql
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > *Proposed solution: Archive old docs html for large packages to cloud storage*
> > > > > > > > > > > >> > > > > > > > > I'm wondering if it would be reasonable to truly archive the docs for
> > > > > > > > > > > >> > > > > > > > > some of the older versions of these packages. Perhaps the last 18
> > > > > > > > > > > >> > > > > > > > > months? Maybe we could drop the html in a blob storage bucket with
> > > > > > > > > > > >> > > > > > > > > instructions for building the docs if absolutely necessary?
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > *Improving docs building moving forward*
> > > > > > > > > > > >> > > > > > > > > There's an open Issue <https://github.com/apache/airflow-site/issues/719>
> > > > > > > > > > > >> > > > > > > > > for migrating the docs to a framework, but it's not at all a
> > > > > > > > > > > >> > > > > > > > > straightforward task for the archived docs. I think that we should
> > > > > > > > > > > >> > > > > > > > > institute a policy of archiving old documentation to cloud storage
> > > > > > > > > > > >> > > > > > > > > after X time and use a framework for building docs in a scalable and
> > > > > > > > > > > >> > > > > > > > > sustainable way moving forward. Maybe we could chat with iceberg
> > > > > > > > > > > >> > > > > > > > > folks about how they moved from mkdocs to hugo?
> > > > > > > > > > > >> > > > > > > > > <https://github.com/apache/iceberg/issues/3616>
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > Shoutout to Utkarsh for helping me through all this!
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> >
> > > > > > > > > > > >>
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Regards
> > >
> > > Bowrna Prabhakaran
> > >
> >
>
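
On the 18-month cutoff discussed in the quoted thread (Jarek suggested making it a parameter), here is a minimal sketch of how the "which versions are old enough to archive" check might look. It is only an illustration: the function names, the `max_age_months` parameter, and the idea of passing release dates in as a dict are all assumptions of mine, not anything that exists in airflow-site or breeze today.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    # A month only counts once the day-of-month has been reached.
    if later.day < earlier.day:
        months -= 1
    return months


def versions_to_archive(
    release_dates: dict[str, date],
    today: date,
    max_age_months: int = 18,  # hypothetical default, taken from the thread
) -> list[str]:
    """Return versions released at least `max_age_months` ago, oldest first."""
    old = [
        (released, version)
        for version, released in release_dates.items()
        if months_between(released, today) >= max_age_months
    ]
    return [version for _, version in sorted(old)]
```

The caller would supply the release date per version (from wherever we decide to track it) and get back the list of versions whose docs could be moved to S3. Changing `max_age_months` is all it takes to try a different cutoff.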
