@Kou, automated commits / PRs to trigger Appveyor/Travis CI are most likely part of the solution, but there are other issues.
We should start a Google document or something to enumerate all the
things we want to implement and identify the technical issues with
each thing. For example, at present, among other things:

* Building a newer version of arrow-dist requires manual tweaks to many files
* Packages are deployed from different GitHub repos (apache/arrow-dist
  _and_ wesm/arrow-dist, due to a quirk with Appveyor)
* Packages must be manually downloaded from Bintray (by clicking
  around a web UI) after a successful run

As a high-level goal, we need to figure out how to prevent the
disaster of the last 7 days from ever occurring again. Easily 10-20
hours were wasted this past week on packaging issues, and some things
are still broken -- this is not acceptable to me. It's like the frog
slowly boiling in water; we have allowed this technical debt to
accumulate, and we should not allow this to continue.

It does not seem that many of the issues we are having are related to
or caused by CMake, so I'd like to table the build system discussion
until we identify solutions to the other problems.

- Wes

On Sun, Mar 25, 2018 at 9:38 AM, Kouhei Sutou <k...@clear-code.com> wrote:
> Hi,
>
> How about creating a pull request to apache/arrow-dist with
> changes to use the latest apache/arrow? (A sample script
> exists at the end.)
>
> If the pull request is created, we can test packaging with
> the latest apache/arrow on Travis CI. If we add the
> following configuration to .travis.yml in apache/arrow-dist,
> we can get a notification via e-mail on failure:
>
> ---
> notifications:
>   email:
>     recipients:
>       - build-fail...@arrow.apache.org # or something
>     # See https://docs.travis-ci.com/user/notifications/#Changing-notification-frequency
>     # for details
>     on_success: never # or change
>     on_failure: always
> ---
>
> If the pull request succeeds, we just close the pull
> request. (It should be automated.) Or we merge it into
> master and publish daily packages.
>
>
> ---
> #!/usr/bin/env ruby
>
> require "octokit"
>
> github_token = ENV["GITHUB_TOKEN"]
> if github_token.nil?
>   $stderr.puts("Must specify GitHub access token by GITHUB_TOKEN environment variable")
>   exit(false)
> end
> client = Octokit::Client.new(:access_token => github_token)
>
> now = Time.now
> branch = now.strftime("update-%Y-%m-%d")
> message = now.strftime("Update at %Y-%m-%d")
>
> # Run a command, echo it, and return its standard output as a string.
> # Raises if the command exits with a non-zero status.
> def run(*command_line)
>   IO.pipe do |input, output|
>     expanded_command_line = command_line.join(" ")
>     puts(expanded_command_line)
>     response = nil
>     begin
>       read_thread = Thread.new do
>         response = input.read
>       end
>       unless system(*command_line, :out => output)
>         raise "failed to run: #{expanded_command_line}"
>       end
>     ensure
>       # Close the write end first so the reader thread sees EOF and
>       # the join cannot hang, even when the command failed.
>       output.close
>       read_thread.join
>     end
>     response
>   end
> end
>
> run("git", "fetch", "--all", "--prune")
> run("git", "checkout", "master")
> run("git", "rebase", "upstream/master")
>
> begin
>   run("git", "checkout", "-b", branch)
>   Dir.chdir("arrow") do
>     current_version = run("git", "log", "-n1", "--format=%H", "HEAD").chomp
>     run("git", "checkout", "master")
>     run("git", "pull")
>     new_version = run("git", "log", "-n1", "--format=%H", "HEAD").chomp
>     # Nothing to do when the arrow submodule is already up to date.
>     exit(true) if current_version == new_version
>   end
>   Dir.chdir("cpp-linux") do
>     run("rake", "version:update")
>   end
>   # And do something for other packages
>   run("git", "commit", "-a", "-m", message)
>   run("git", "push", "origin", branch)
> ensure
>   # Always restore the local checkout, whether or not the update succeeded.
>   run("git", "checkout", "master")
>   run("git", "submodule", "update", "arrow")
>   run("git", "branch", "-D", branch)
> end
>
> client.create_pull_request("apache/arrow-dist", "master", branch, message)
> ---
>
> Thanks,
> --
> kou
>
> In <CAJPUwMB9CKD-N4RgHgLZFqO6sN5fCXXv26_OjVHHW1oKQqA=s...@mail.gmail.com>
>   "Confronting Arrow packaging problems" on Fri, 23 Mar 2018 12:58:54 -0400,
>   Wes McKinney <wesmck...@gmail.com> wrote:
>
>> hi folks,
>>
>> So, I want to bring light to the problems we are having delivering
>> binary artifacts after Arrow releases.
>>
>> We have some amount of packaging automation implemented in
>> https://github.com/apache/arrow-dist using Travis CI and Appveyor to
>> upload packages to Bintray, a package hosting service.
>>
>> Unfortunately, we discovered a bunch of problems with these packaging
>> scripts after the release vote closed on Monday, and now 4 days later,
>> we still have been unable to post binaries to
>> https://pypi.python.org/pypi/pyarrow
>>
>> This is no one's fault, but it highlights structural problems with our
>> development process:
>>
>> * Why does producing packages after a release require error-prone manual
>>   labor?
>>
>> * Why are we only finding out about packaging problems after a release
>>   vote closes?
>>
>> * Why is setting up nightly binary builds a brittle and bespoke process?
>>
>> I hope all agree that:
>>
>> * Packaging should not be a hardship or require a lot of manual labor
>>
>> * Packaging problems on the master branch should be made known within
>>   ~24 hours, so they can be remedied immediately
>>
>> * It should be straightforward to produce binary artifacts for all
>>   supported platforms and programming languages
>>
>> Eventually, we should include some binary artifacts in our release
>> votes, but we are pretty far away from suitable automation to make
>> this possible.
>>
>> I don't know any easy solutions, but Apache Arrow has grown widely
>> used enough that I think it's worth our taking the time to plan and
>> execute some solutions to these problems, which I expect to pay
>> dividends in our community's productivity over time.
>>
>> Thanks,
>> Wes
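
One way to make the "known within ~24 hours" goal from the quoted
message concrete is a staleness check over nightly packaging results.
This is only a sketch under stated assumptions: BuildRecord,
packaging_stale?, and the sample timestamps are hypothetical names and
data for illustration, not anything that exists in arrow-dist.

```ruby
# Hypothetical build record: when a nightly packaging job finished and
# whether it succeeded. These names are illustrative assumptions.
BuildRecord = Struct.new(:finished_at, :succeeded)

# Returns true when no successful build finished within the last
# max_age_hours before `now`, i.e. packaging on master may be broken.
def packaging_stale?(builds, now, max_age_hours = 24)
  cutoff = now - max_age_hours * 60 * 60
  builds.none? { |build| build.succeeded && build.finished_at >= cutoff }
end

now = Time.utc(2018, 3, 25, 12, 0, 0)
builds = [
  BuildRecord.new(Time.utc(2018, 3, 23, 6, 0, 0), true),  # last good build
  BuildRecord.new(Time.utc(2018, 3, 25, 6, 0, 0), false), # latest build failed
]

# Prints "true": the only successful build is older than 24 hours.
puts(packaging_stale?(builds, now))
```

A check like this could run after each nightly job and feed the
e-mail notification discussed above; where the build records come
from (Travis CI, Appveyor, or something else) is left open here.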