+1 (non-binding)

On Tue, Aug 13, 2024 at 6:16 AM Gengliang Wang <ltn...@gmail.com> wrote:

> +1
>
> On Mon, Aug 12, 2024 at 2:01 PM Xiao Li <gatorsm...@gmail.com> wrote:
>
>> +1
>>
>> Mich Talebzadeh <mich.talebza...@gmail.com> wrote on Mon, Aug 12, 2024 at 13:11:
>>
>>> Hi Kent,
>>>
>>> Could you, if possible, provide a heuristic estimate of the storage
>>> reduction that this approach will achieve?
>>>
>>> Thanks
>>>
>>> Mich Talebzadeh,
>>>
>>> Architect | Data Engineer | Data Science | Financial Crime
>>> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial
>>> College London <https://en.wikipedia.org/wiki/Imperial_College_London>
>>> London, United Kingdom
>>>
>>> view my Linkedin profile
>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>
>>> https://en.everybodywiki.com/Mich_Talebzadeh
>>>
>>> *Disclaimer:* The information provided is correct to the best of my
>>> knowledge but of course cannot be guaranteed. It is essential to note
>>> that, as with any advice, quote "one test result is worth one-thousand
>>> expert opinions (Wernher von Braun
>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>>
>>> On Mon, 12 Aug 2024 at 14:55, Mich Talebzadeh <mich.talebza...@gmail.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> On the face of it, this email contains many references, making it
>>>> difficult to follow. Perhaps a simpler explanation could improve voting
>>>> participation.
>>>>
>>>> The STAR methodology can be helpful in understanding and evaluating
>>>> this proposal. STAR stands for Situation, Task, Action, Result.
>>>>
>>>> Let us have a look at this:
>>>>
>>>> *S*ituation:
>>>>
>>>> - The Spark website repository is reaching its storage limit on
>>>> GitHub-hosted runners.
>>>>
>>>> *T*ask:
>>>>
>>>> - Reduce storage usage without compromising access to documentation.
>>>>
>>>> *A*ction (proposed):
>>>>
>>>> - Move documentation releases from the dev directory to the
>>>> release directory within the Apache distribution.
>>>> - Leverage the Apache Archives service to create permanent links
>>>> for the documentation.
>>>> - Upload older website-hosted documentation manually via SVN.
>>>> - Optionally, delete old documentation and update links or use
>>>> redirection as needed.
>>>>
>>>> *R*esult:
>>>>
>>>> - Reduced storage usage on GitHub-hosted runners.
>>>> - Permanent, publicly accessible links for Spark documentation via
>>>> the Apache Archives.
>>>> - Potential need for manual upload of older documentation and link
>>>> updates.
>>>>
>>>> Consider including an estimate of the storage reduction achieved through
>>>> this approach.
>>>> Overall, the proposal offers a viable solution for managing Spark
>>>> documentation while reducing storage concerns. However, addressing the
>>>> potential complexity of managing older documentation versions is crucial.
>>>>
>>>> +1 for me
>>>>
>>>> Mich Talebzadeh
>>>>
>>>> On Mon, 12 Aug 2024 at 10:09, Kent Yao <y...@apache.org> wrote:
>>>>
>>>>> Archive Spark Documentations in Apache Archives
>>>>>
>>>>> Hi dev,
>>>>>
>>>>> To address the issue of the Spark website repository size
>>>>> reaching the storage limit for GitHub-hosted runners [1], I suggest
>>>>> enhancing step [2] in our release process by relocating the
>>>>> documentation releases from the dev [3] directory to the release
>>>>> directory [4]. They would then be captured by the Apache Archives
>>>>> service [5] to create permanent links, which would serve as alternative
>>>>> endpoints for our documentation, like
>>>>>
>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-docs/_site/index.html
>>>>> for
>>>>> https://spark.apache.org/docs/3.5.2/index.html
>>>>>
>>>>> Note that the example above still uses the staging repository;
>>>>> the link will become
>>>>> https://archive.apache.org/dist/spark/docs/3.5.2/index.html.
>>>>>
>>>>> For older releases hosted on the Spark website [6], we also need to
>>>>> upload them via SVN manually.
>>>>>
>>>>> After that, when we reach the threshold again, we can delete some of
>>>>> the old ones on page [6], and update their links on page [7] or use
>>>>> redirection.
>>>>>
>>>>> JIRA ticket: https://issues.apache.org/jira/browse/SPARK-49209
>>>>>
>>>>> Please vote on the idea of Archive Spark Documentations in
>>>>> Apache Archives for the next 72 hours:
>>>>>
>>>>> [ ] +1: Accept the proposal
>>>>> [ ] +0
>>>>> [ ] -1: I don’t think this is a good idea because …
>>>>>
>>>>> Bests,
>>>>> Kent Yao
>>>>>
>>>>> [1] https://lists.apache.org/thread/o0w4gqoks23xztdmjjj26jkp1yyg2bvq
>>>>> [2] https://spark.apache.org/release-process.html#upload-to-apache-release-directory
>>>>> [3] https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-docs/
>>>>> [4] https://dist.apache.org/repos/dist/release/spark/docs/3.5.2
>>>>> [5] https://archive.apache.org/dist/spark/
>>>>> [6] https://github.com/apache/spark-website/tree/asf-site/site/docs
>>>>> [7] https://spark.apache.org/documentation.html
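
The proposal pairs each docs page on the Spark website with an alternative endpoint under the Apache Archives, using 3.5.2 as the example. If links are later updated or redirected, that correspondence could be sketched roughly as below. This is only an illustration and not part of the proposal: the function name `archive_url` is hypothetical, and it assumes every version follows the same path layout as the 3.5.2 example in the thread.

```python
# Hypothetical helper: maps a Spark website docs URL to the Apache Archives
# URL suggested in the thread. The two prefixes are taken from the 3.5.2
# example; that other versions follow the same layout is an assumption.
SITE_DOCS = "https://spark.apache.org/docs/"
ARCHIVE_DOCS = "https://archive.apache.org/dist/spark/docs/"


def archive_url(site_url: str) -> str:
    """Return the archive endpoint for a docs page hosted on the website."""
    if not site_url.startswith(SITE_DOCS):
        raise ValueError(f"not a Spark website docs URL: {site_url}")
    # Keep the "<version>/<page>" tail and swap only the prefix.
    return ARCHIVE_DOCS + site_url[len(SITE_DOCS):]


if __name__ == "__main__":
    print(archive_url("https://spark.apache.org/docs/3.5.2/index.html"))
    # prints https://archive.apache.org/dist/spark/docs/3.5.2/index.html
```

A redirect rule or a link rewrite on the documentation page would need the same prefix swap; which of the two to use is left open in the thread.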