Hi Neal,

In general the improvements to the site sound good, and I agree with
moving the site into the apache/arrow-site repository.

It sounds like a committer will have to volunteer a PAT for the Travis
CI settings in https://travis-ci.org/apache/arrow-site/settings

Even though you can't get at such an environment variable there after
it's set, it could still technically be compromised. Personally I
wouldn't be comfortable having a token with "repo" scope out there.

We might need to think about this some more -- the general idea of
making it easier to deploy the website I'm totally on board with.

- Wes

On Fri, Aug 2, 2019 at 1:35 PM Neal Richardson
<neal.p.richard...@gmail.com> wrote:
>
> Hi all,
> https://issues.apache.org/jira/browse/ARROW-5746 requested to move the
> source for https://arrow.apache.org out of `apache/arrow` due to the
> growing number of binary files (mostly images) there.
>
> https://issues.apache.org/jira/browse/ARROW-4473 requested
> improvements to the ability to make a test deploy of the website and
> noted challenges/bugs in trying to do this when the site `baseurl` is
> a subdirectory.
>
> On my fork of `arrow-site` [1] I have a solution to both. I created a
> `master` branch and copied the contents of the `site/` directory in
> `apache/arrow` to that, using `git filter-branch --prune-empty
> --subdirectory-filter site master` to preserve the commit history [2].
> Then I added a build script [3] that gets executed by Travis-CI [4].
>
> The script builds the Jekyll site and pushes it to a branch that gets
> published. On `apache/arrow-site`, commits to the `master` branch
> trigger a build of the Jekyll site and push the result to the
> `asf-site` branch. On forks, commits to `master` build the site and
> publish to the `gh-pages` branch, which can deploy to GitHub Pages.
>
> ## Features
>
> * Automatic building of the arrow.apache.org site whenever changes are
> made to the Jekyll source--no manual build step required.
> * Automatic building of a test site from your fork, which will enable
> reviewers to verify your changes without having to build and serve
> locally and trust that what works locally will work when deployed.
> * Relative URL problems are fixed: links work regardless of whether
> the "base URL" is top level or a subdirectory.
> * Reduced size of the core `apache/arrow` repository
> * Documentation publishing is not affected. Updating the contents of
> the `docs/` directory in the published `asf-site` branch can continue
> to happen by whatever other process. The automatic building and
> publishing of the Jekyll site does not overwrite the `docs/`
> directory.
>
> ## Usage
>
> Local development and serving of the Jekyll site is not affected by
> this build process--it works exactly the same as before, just located
> in the `arrow-site` repository instead of the `site/` directory of
> `apache/arrow`.
>
> To enable the automatic building on your fork, there are a couple of
> quick setup steps to enable GitHub Pages and Travis-CI, described here
> [5].
>
> In order to set up the automatic deploy on `apache/arrow-site`, a
> committer will need to set a GITHUB_PAT there. I imagine there could
> be some hesitation to doing this, but it is safe because:
>
> 1. Builds only happen on the master branch, and only committers can
> modify the master branch, so by accepting a patch to `master`, they're
> implicitly accepting a patch to `asf-site`
> 2. Malicious actors can't modify the build script in a pull request
> and use the token because Travis does "not provide [repository-setting
> environment variables] to untrusted builds, triggered by pull requests
> from another repository" [6]
> 3. Non-committers cannot access the Travis-CI settings to alter the
> GITHUB_PAT (and even committers cannot view the value of the token
> once it is set)
> 4. IIUC there is still a manual action required to get the ASF to
> update arrow.apache.org with the contents of the `asf-site` branch
>
> While it would be useful, it is not required that we enable automatic
> deploy on `apache/arrow-site` in order to get benefit from this
> proposal because this enables contributors to opt in to deploying test
> sites from their forks, and those test sites will actually work.
>
> Let me know if you have any questions or concerns. If there are no
> objections, then to proceed I'll need a committer to create an orphan
> `master` branch on `apache/arrow-site`, and then I can make a pull
> request to that, which we'd want to merge without squashing in order
> to preserve the git history of the site from `apache/arrow`.
>
> Thanks,
> Neal
>
> [1] https://github.com/nealrichardson/arrow-site/
> [2] https://github.com/nealrichardson/arrow-site/commits/master
> [3] https://github.com/nealrichardson/arrow-site/blob/master/build-and-deploy.sh
> [4] https://github.com/nealrichardson/arrow-site/blob/master/.travis.yml
> [5] https://github.com/nealrichardson/arrow-site/tree/master#previewing-the-site
> [6] https://docs.travis-ci.com/user/environment-variables/#defining-variables-in-repository-settings
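
For reference, the deploy rules described in the quoted proposal can be sketched roughly as below. This is a minimal Python illustration of the branch-selection logic only, not the actual `build-and-deploy.sh` [3]. The function name `deploy_branch` and its parameters are hypothetical, and the pull-request check is an assumption that models Travis withholding repository-setting variables (and so the token) from untrusted builds.

```python
from typing import Optional

# Repository named in the proposal as the upstream site repository.
UPSTREAM_SLUG = "apache/arrow-site"


def deploy_branch(repo_slug: str, branch: str, is_pull_request: bool) -> Optional[str]:
    """Return the branch the built site would be pushed to, or None to skip.

    Hypothetical helper: it only mirrors the rules stated in the proposal.
    """
    # Assumption: pull request builds have no token, so they never deploy.
    if is_pull_request:
        return None
    # Builds are only published from the master branch.
    if branch != "master":
        return None
    # Upstream publishes to asf-site; forks publish to gh-pages.
    if repo_slug == UPSTREAM_SLUG:
        return "asf-site"
    return "gh-pages"


if __name__ == "__main__":
    print(deploy_branch("apache/arrow-site", "master", False))
    print(deploy_branch("nealrichardson/arrow-site", "master", False))
    print(deploy_branch("apache/arrow-site", "master", True))
```

Under these assumptions, the only path that writes to `asf-site` is a non-pull-request build of `master` on `apache/arrow-site`, which is the case that needs the committer-supplied token.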