Hi Kent,

Could you, if possible, provide a rough estimate of the storage
reduction that would be achieved through this approach?
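
For reference, one way to get a ballpark figure would be to sum the
per-version documentation directories in a local clone of
apache/spark-website (asf-site branch, see [6] below). A rough sketch in
Python; the clone path is hypothetical and the figures would of course
need checking against the real repository:

    from pathlib import Path

    # Hypothetical path to a local clone of apache/spark-website (asf-site branch)
    DOCS = Path("spark-website/site/docs")

    def dir_size(path: Path) -> int:
        # Recursively sum file sizes for one documentation version.
        return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())

    total = 0
    for version_dir in sorted(p for p in DOCS.iterdir() if p.is_dir()):
        size = dir_size(version_dir)
        total += size
        print(f"{version_dir.name:>10}: {size / 1e6:8.1f} MB")
    print(f"{'total':>10}: {total / 1e9:8.2f} GB")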

Thanks

Mich Talebzadeh,

Architect | Data Engineer | Data Science | Financial Crime
PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College
London <https://en.wikipedia.org/wiki/Imperial_College_London>
London, United Kingdom


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but cannot be guaranteed. It is essential to note that, as
with any advice, "one test result is worth one-thousand expert
opinions" (Wernher von Braun
<https://en.wikipedia.org/wiki/Wernher_von_Braun>).


On Mon, 12 Aug 2024 at 14:55, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Hello,
>
> On the face of it, this email contains many references, making it
> difficult to follow. Perhaps a simpler explanation could improve voting
> participation.
>
> The STAR methodology can be helpful in understanding and evaluating this
> proposal. STAR stands for Situation, Task, Action, Result.
>
> Let us have a look at this:
>
> *S*ituation:
>
>    - The Spark website repository is reaching its storage limit on
>    GitHub-hosted runners.
>
> *T*ask:
>
>    - Reduce storage usage without compromising access to documentation.
>
> *A*ction (proposed):
>
>    - Move documentation releases from the dev directory to the
>    release directory within the Apache distribution.
>    - Leverage the Apache Archives service to create permanent links for
>    the documentation.
>    - Upload older website-hosted documentation manually via SVN.
>    - Optionally, delete old documentation and update links/use
>    redirection as needed.
>
> *Result:*
>
>    - Reduced storage usage on GitHub-hosted runners.
>    - Permanent, publicly accessible links for Spark documentation via the
>    Apache Archives.
>    - Potential need for manual upload of older documentation and link
>    updates.
>
>
> Consider including an estimated storage reduction achieved through this
> approach.
> Overall, the proposal offers a viable solution for managing Spark
> documentation while reducing storage concerns. However, addressing the
> potential complexity of managing older documentation versions is crucial.
>
> +1 for me
>
> Mich Talebzadeh,
>
> Architect | Data Engineer | Data Science | Financial Crime
> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College
> London <https://en.wikipedia.org/wiki/Imperial_College_London>
> London, United Kingdom
>
>
>    view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* The information provided is correct to the best of my
> knowledge but cannot be guaranteed. It is essential to note that, as
> with any advice, "one test result is worth one-thousand expert
> opinions" (Wernher von Braun
> <https://en.wikipedia.org/wiki/Wernher_von_Braun>).
>
>
> On Mon, 12 Aug 2024 at 10:09, Kent Yao <y...@apache.org> wrote:
>
>> Archive Spark Documentation in Apache Archives
>>
>> Hi dev,
>>
>> To address the issue of the Spark website repository size
>> reaching the storage limit for GitHub-hosted runners [1], I suggest
>> enhancing step [2] in our release process by relocating the
>> documentation releases from the dev[3] directory to the release
>> directory[4]. Then it would be captured by the Apache Archives
>> service[5] to create permanent links, which would be alternative
>> endpoints for our documentation, like
>>
>>
>> https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-docs/_site/index.html
>> for
>> https://spark.apache.org/docs/3.5.2/index.html
>>
>> Note that the example above still points to the staging repository;
>> once archived, the link will become
>> https://archive.apache.org/dist/spark/docs/3.5.2/index.html.
>>
>> For older releases hosted on the Spark website [6], we also need to
>> upload them via SVN manually.
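>>
>> For illustration, both the relocation of new documentation and the
>> manual backfill of older releases could be done with the svn CLI; a
>> minimal sketch in Python (the helper names and commit messages are
>> hypothetical, and committing to dist.apache.org requires committer
>> credentials):
>>
>>     import subprocess
>>
>>     DEV = "https://dist.apache.org/repos/dist/dev/spark"
>>     RELEASE = "https://dist.apache.org/repos/dist/release/spark"
>>
>>     def publish_docs(version: str, rc: str) -> None:
>>         # Server-side move of the RC docs into the release tree, e.g.
>>         # dev/spark/v3.5.2-rc5-docs/_site -> release/spark/docs/3.5.2
>>         src = f"{DEV}/v{version}-{rc}-docs/_site"
>>         dst = f"{RELEASE}/docs/{version}"
>>         subprocess.run(
>>             ["svn", "move", src, dst, "-m", f"Publish Spark {version} docs"],
>>             check=True,
>>         )
>>
>>     def backfill_docs(version: str, local_docs_dir: str) -> None:
>>         # Import docs that today exist only in the spark-website repo [6].
>>         subprocess.run(
>>             ["svn", "import", local_docs_dir, f"{RELEASE}/docs/{version}",
>>              "-m", f"Backfill Spark {version} docs"],
>>             check=True,
>>         )
>>
>>     # e.g. publish_docs("3.5.2", "rc5")
>>     #      backfill_docs("2.4.8", "spark-website/site/docs/2.4.8")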
>>
>> After that, when we reach the threshold again, we can delete some of
>> the old ones on page [6], and update their links on page [7] or use
>> redirection.
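>>
>> If we go the redirection route, one lightweight option would be to
>> replace each removed version under site/docs [6] with a single stub
>> that redirects to the archive copy. A hypothetical sketch (the target
>> URL pattern follows [5]; the stub layout is an assumption, not part of
>> this proposal):
>>
>>     from pathlib import Path
>>
>>     STUB = """<!DOCTYPE html>
>>     <html>
>>       <head>
>>         <meta http-equiv="refresh" content="0; url={url}">
>>         <link rel="canonical" href="{url}">
>>       </head>
>>       <body>Moved to <a href="{url}">{url}</a></body>
>>     </html>
>>     """
>>
>>     def write_redirect(site_docs: Path, version: str) -> None:
>>         # Replace site/docs/<version>/ with a one-file redirect to the archive.
>>         target = f"https://archive.apache.org/dist/spark/docs/{version}/index.html"
>>         out = site_docs / version
>>         out.mkdir(parents=True, exist_ok=True)
>>         (out / "index.html").write_text(STUB.format(url=target))
>>
>>     # e.g. write_redirect(Path("spark-website/site/docs"), "2.4.8")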
>>
>> JIRA ticket: https://issues.apache.org/jira/browse/SPARK-49209
>>
>> Please vote on this proposal to archive Spark documentation in the
>> Apache Archives. The vote is open for the next 72 hours:
>>
>> [ ] +1: Accept the proposal
>> [ ] +0
>> [ ] -1: I don’t think this is a good idea because …
>>
>> Bests,
>> Kent Yao
>>
>> [1] https://lists.apache.org/thread/o0w4gqoks23xztdmjjj26jkp1yyg2bvq
>> [2]
>> https://spark.apache.org/release-process.html#upload-to-apache-release-directory
>> [3] https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-docs/
>> [4] https://dist.apache.org/repos/dist/release/spark/docs/3.5.2
>> [5] https://archive.apache.org/dist/spark/
>> [6] https://github.com/apache/spark-website/tree/asf-site/site/docs
>> [7] https://spark.apache.org/documentation.html
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
