I'm currently modifying the CI bot to do this automatically; it should be finished by Monday.

On 02/08/2019 07:41, Jark Wu wrote:
Hi Chesnay,

Could Flink committers be given permissions on the flink-ci/flink repo?
Several times, after I pushed new commits, the old build jobs were still
pending and had not been canceled.
Until that is fixed, we could manually cancel old jobs to save build
resources.

Best,
Jark
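
A minimal sketch of the selection step such automation might use (this is
not the actual ci-bot logic, which this thread does not show): given the
builds known for mirrored PRs, keep the newest build per PR and mark older
builds that are still active for cancellation. The Build record, its fields
and the state names are hypothetical, and the call that actually cancels a
build on the CI service is deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Build:
    """Hypothetical record for one CI build of a mirrored pull request."""
    build_id: int   # assumed to increase with creation order
    pr_number: int
    state: str      # assumed values: "created", "started", "passed", "failed"


# States in which a build still occupies (or waits for) a worker.
ACTIVE_STATES = {"created", "started"}


def superseded_builds(builds: Iterable[Build]) -> List[Build]:
    """Return active builds that are older than the newest build of the same PR."""
    builds = list(builds)  # allow generators; we iterate twice
    newest: Dict[int, int] = {}
    for b in builds:
        newest[b.pr_number] = max(newest.get(b.pr_number, b.build_id), b.build_id)
    stale = [
        b for b in builds
        if b.state in ACTIVE_STATES and b.build_id < newest[b.pr_number]
    ]
    return sorted(stale, key=lambda b: b.build_id)
```

Finished builds are never returned, so only jobs that still hold or wait for
a worker would be canceled.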


On Wed, 10 Jul 2019 at 16:17, Chesnay Schepler <ches...@apache.org> wrote:

Your best bet is to look at the first commit in the PR and check its
parent commit.

To re-run things, you will have to rebase the PR on the latest master.
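
A minimal sketch of that check, assuming a linear PR history (no merges
from master into the PR branch): take the output of
git log --format='%H %P' for the PR's commits and read the first parent of
the oldest commit. The function names and the example revision range are
illustrative only.

```python
import subprocess
from typing import Optional


def base_commit(log_output: str) -> Optional[str]:
    """Return the first parent of the oldest commit in `git log --format='%H %P'` output.

    git log prints the newest commit first, so the oldest PR commit is the
    last non-empty line. Returns None if there is no commit or no parent.
    """
    rows = [line.split() for line in log_output.splitlines() if line.strip()]
    if not rows:
        return None
    oldest = rows[-1]
    return oldest[1] if len(oldest) > 1 else None


def pr_base(repo_dir: str, pr_range: str) -> Optional[str]:
    """Run git for a revision range, e.g. 'upstream/master..my-pr-branch' (example names)."""
    out = subprocess.run(
        ["git", "log", "--format=%H %P", pr_range],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    return base_commit(out)
```

If the returned hash differs from the current tip of master, the build
result refers to an older master and a rebase is needed to re-run it.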

On 10/07/2019 03:32, Kurt Young wrote:
Thanks for all your efforts, Chesnay; this has indeed improved our
development experience a lot. BTW, do you know how to find out which
master commit the CI ran against?

For example, take this one:
https://travis-ci.com/flink-ci/flink/jobs/214542568
It shows that the build passed for the commits, which were rebased on
master when the CI was triggered. But the master commit the CI ran on may
or may not be the same as the current master. If it is the same, I can
simply rely on the passed result and push the commits; if it is not, I
think I should find another way to re-trigger the tests against the newest
master.

Do you know where I can get this information?

Best,
Kurt


On Tue, Jul 9, 2019 at 3:27 AM Chesnay Schepler <ches...@apache.org> wrote:
The kinks have been worked out; the bot is running again and PR builds
are once again no longer running on ASF resources.

PRs are mirrored to: https://github.com/flink-ci/flink
Bot source: https://github.com/flink-ci/ci-bot

On 08/07/2019 17:14, Chesnay Schepler wrote:
I have temporarily re-enabled running PR builds on the ASF account;
migrating to the Travis subscription caused some issues in the bot
that I have to fix first.

On 07/07/2019 23:01, Chesnay Schepler wrote:
The vote has passed unanimously in favor of migrating to a separate
Travis account.

I will now set things up such that pull requests are no longer run on
the ASF servers.
This is a major step in reducing our usage of ASF resources.
For the time being we'll use the free Travis plan for flink-ci (i.e. 5
workers, which is the same as what the ASF gives us). Over the course of
the next week we'll set up the Ververica subscription to increase this
limit.
From now on, a bot will mirror all new and updated pull requests to a
mirror repository (https://github.com/flink-ci/flink-ci) and write an
update into the PR once the build is complete.
I have run the bot for the past 3 days in parallel to our existing
Travis setup and it was working without major issues.

The biggest change that contributors will see is that there's no
longer an icon next to each commit. We may revisit this in the future.

I'll set up a repo with the source of the bot later.

On 04/07/2019 10:46, Chesnay Schepler wrote:
I've raised a JIRA ticket
<https://issues.apache.org/jira/browse/INFRA-18703> with INFRA to
inquire whether it would be possible to switch to a different Travis
account, and if so what steps would need to be taken.
We need a proper confirmation from INFRA since we are not in full
control of the flink repository (for example, we cannot access the
settings page).

If this is indeed possible, Ververica is willing to sponsor a Travis
account for the Flink project.
This would provide us with more than enough resources.

Since this makes the project more reliant on resources provided by
external companies I would like to vote on this.

Please vote on this proposal, as follows:
[ ] +1, Approve the migration to a Ververica-sponsored Travis
account, provided that INFRA approves
[ ] -1, Do not approve the migration to a Ververica-sponsored
Travis account

The vote will be open for at least 24h, and until we have
confirmation from INFRA. The voting period may be shorter than the
usual 3 days since our current setup is effectively not working.

On 04/07/2019 06:51, Bowen Li wrote:
Re: > Are they using their own Travis CI pool, or did they switch to
an entirely different CI service?

I reached out to Wes and Krisztián from the Apache Arrow PMC. They are
currently moving away from ASF's Travis to their own in-house metal
machines at [1] with a custom CI application at [2]. They've seen
significant improvements, with much higher performance and basically
no resource waiting time: a "night-and-day" difference, to quote Wes.

Re: > If we can just switch to our own Travis pool, just for our
project, then this might be something we can do fairly quickly?

I believe so, according to [3] and [4]


[1] https://ci.ursalabs.org/
[2] https://github.com/ursa-labs/ursabot
[3] https://docs.travis-ci.com/user/migrate/open-source-repository-migration
[4] https://docs.travis-ci.com/user/migrate/open-source-on-travis-ci-com


On Wed, Jul 3, 2019 at 12:01 AM Chesnay Schepler <ches...@apache.org> wrote:

      Are they using their own Travis CI pool, or did they switch to an
      entirely different CI service?

      If we can just switch to our own Travis pool, just for our project,
      then this might be something we can do fairly quickly?

      On 03/07/2019 05:55, Bowen Li wrote:
      > I responded in the INFRA ticket [1] that I believe they are using the
      > wrong metric against Flink, and that total build time is a completely
      > different thing from guaranteed build capacity.
      >
      > My response:
      >
      > "As mentioned above, since I started to pay attention to Flink's build
      > queue a few tens of days ago, I'm in Seattle and I saw no build kicking
      > off in PST daytime on weekdays for Flink. Our teammates in China and
      > Europe have also reported similar observations. So we need to evaluate
      > where the large total build time came from. If 1) your number and 2) our
      > observations from three locations that cover pretty much a full day are
      > all true, I **guess** one reason can be that the extra build time most
      > likely came from weekends, when other Apache projects may be idle and
      > Flink just drains its congested queue hard.
      >
      > Please be aware that we're not complaining about the lack of resources
      > in general; I'm complaining about the lack of **stable, dedicated**
      > resources. An example of the latter: currently, even if no build is in
      > Flink's queue and I submit a request that becomes the queue head in the
      > PST morning, my build won't even start for 6-8+ hours. That is an
      > absurd amount of waiting time.
      >
      > That is to say, if ASF INFRA decides to adopt a quota system and grants
      > Flink five DEDICATED servers that run all the time only for Flink,
      > that'll be PERFECT and can totally solve our problem now.
      >
      > I feel what's missing in ASF INFRA's Travis resource pool is some level
      > of build capacity SLAs and certainty"
      >
      >
      > Again, I believe these two problems differ in nature: long build time
      > vs. lack of dedicated build resources. That is to say, shortening build
      > time may or may not relieve the situation. I'm slightly negative on
      > disabling IT cases for PRs: the downside is that we risk missing bugs
      > in a PR that the UTs don't catch, which may cost a lot more to fix
      > later, especially if they slow down or even block others. But I am open
      > to others' opinions on it.
      >
      > AFAICT from INFRA ticket [1], donating to ASF INFRA won't be a feasible
      > way to solve our problem, since INFRA's pool is fully shared and they
      > have no control over or finer insight into resource allocation to a
      > specific Apache project. As mentioned in [1], Apache Arrow is moving
      > away from the ASF INFRA Travis pool (they are actually surprised Flink
      > hasn't planned to do so). I know that Spark is on its own build infra.
      > If we all agree on funding our own build infra, I'd be glad to help
      > investigate potential options after releasing 1.9, since I'm super busy
      > with 1.9 now.
      >
      > [1] https://issues.apache.org/jira/browse/INFRA-18533
      >
      >
      >
      > On Tue, Jul 2, 2019 at 4:46 AM Chesnay Schepler <ches...@apache.org> wrote:
      >
      >> As a short-term stopgap, since we can assume this issue to become much
      >> worse in the following days/weeks, we could disable IT cases in PRs and
      >> only run them on master.
      >>
      >> On 02/07/2019 12:03, Chesnay Schepler wrote:
      >>> People really have to stop thinking that just because something works
      >>> for us it is also a good solution.
      >>> Also, please remember that our builds run for 2h from start to finish,
      >>> and not the 14 _minutes_ it takes for Zeppelin.
      >>> We are dealing with an entirely different scale here, both in terms of
      >>> build times and number of builds.
      >>>
      >>> In this very thread people have been complaining about long queue
      >>> times for their builds. Surprise, other Apache projects have been
      >>> suffering the very same thing due to us not controlling our build
      >>> times. While switching services (be it Jenkins, CircleCI or whatever)
      >>> will possibly work for us (and these options are actually attractive,
      >>> like CircleCI's proper support for build artifacts), it will also
      >>> likely result in us negatively affecting other projects in significant
      >>> ways.
      >>>
      >>> Sure, the Jenkins setup has a good user experience for us, at the cost
      >>> of blocking Jenkins workers for a _lot_ of time. Right now we have 25
      >>> PRs in our queue; that's possibly 50h we'd consume of Jenkins
      >>> resources, and the European contributors haven't even really started yet.
      >>>
      >>> FYI, the latest INFRA response from INFRA-18533:
      >>>
      >>> "Our rough metrics shows that Flink used over 5800 hours of build time
      >>> last month. That is equal to EIGHT servers running 24/7 for the ENTIRE
      >>> MONTH. EIGHT. nonstop.
      >>> When we discovered this last night, we discussed it some and are going
      >>> to tune down Flink to allow only five executors maximum. We cannot
      >>> allow Flink to consume so much of a Foundation shared resource."
      >>>
      >>> So yes, we either
      >>> a) have to heavily reduce our CI usage or
      >>> b) fund our own, either maintaining it ourselves or donating to Apache.
      >>>
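
The figures quoted above can be checked with a short calculation. This is
only a sketch under simplifying assumptions that are not from the thread: a
30-day month, every build occupying exactly one executor for its full
duration, and no new builds arriving while the queue drains. The function
names are illustrative.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month


def equivalent_servers(build_hours: float, hours_per_month: float = HOURS_PER_MONTH) -> float:
    """Number of servers that would be busy 24/7 to produce the given build hours."""
    return build_hours / hours_per_month


def queue_drain_hours(queued_builds: int, hours_per_build: float, executors: int) -> float:
    """Wall-clock hours needed to work off a queue with a fixed number of executors."""
    return queued_builds * hours_per_build / executors


print(round(equivalent_servers(5800), 2))  # 5800 h / 720 h per server -> 8.06
print(queue_drain_hours(25, 2, 5))         # 25 PRs * 2 h / 5 executors -> 10.0
```

Under these assumptions, 25 queued PRs at 2h each amount to 50 build hours,
which five executors would need roughly 10 hours to work through.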
      >>> On 02/07/2019 05:11, Bowen Li wrote:
      >>>> Looking at the git history of the Jenkins script, its core part
      >>>> was finished in March 2017 (with only two minor updates in 2017/2018),
      >>>> so it has been running for over two years now, and it feels like the
      >>>> Zeppelin community has been quite happy with it. @Jeff Zhang
      >>>> <zjf...@gmail.com>, can you share your insights and user
      >>>> experience with the Jenkins+Travis approach?
      >>>>
      >>>> Things like:
      >>>>
      >>>> - Has the approach completely solved the resource capacity problem
      >>>>   for the Zeppelin community? Is the Zeppelin community happy with
      >>>>   the result?
      >>>> - Is the whole configuration chain stable enough (e.g. uptime)?
      >>>> - How often do you need to maintain the Jenkins infra? How many
      >>>>   people are usually involved in maintenance and bug fixes?
      >>>>
      >>>> To me, the downside of this approach seems to be mostly maintenance:
      >>>> maintaining the script and the Jenkins infra.
      >>>>
      >>>> ** Having Our Own Travis-CI.com Account **
      >>>>
      >>>> Another alternative I've been thinking of is to have our own
      >>>> travis-ci.com account with paid dedicated resources. Note that
      >>>> travis-ci.org is the free version and travis-ci.com is the commercial
      >>>> version. We currently use a shared resource pool managed by the ASF
      >>>> INFRA team on travis-ci.org, but we have no control over it: we can't
      >>>> see how it's configured, how many resources are available, how
      >>>> resources are allocated among Apache projects, etc.
      >>>> The nice things about having an account on travis-ci.com are:
      >>>>
      >>>> - relatively low cost with a much better resource guarantee than what
      >>>>   we currently have [1]: $249/month for 5 dedicated concurrent jobs,
      >>>>   $489/month for 10 concurrent jobs
      >>>> - low maintenance work compared to using Jenkins
      >>>> - (potentially) no migration cost according to Travis's doc [2]
      >>>>   (pending verification)
      >>>> - full control over the build capacity/configuration compared to
      >>>>   using ASF INFRA's pool
      >>>>
      >>>> I'd be surprised if we, as such a vibrant community, cannot find and
      >>>> fund $249*12=$2988 a year in exchange for a much better developer
      >>>> experience and much higher productivity.
      >>>>
      >>>> [1] https://travis-ci.com/plans
      >>>> [2] https://docs.travis-ci.com/user/migrate/open-source-repository-migration
      >>>>
      >>>> On Sat, Jun 29, 2019 at 8:39 AM Chesnay Schepler <ches...@apache.org> wrote:
      >>>>
      >>>>      So yes, the Jenkins job keeps pulling the state from Travis until it
      >>>>      finishes.
      >>>>
      >>>>      Not sure I'm comfortable with the idea of using Jenkins workers
      >>>>      just to idle for several hours.
      >>>>
      >>>>      On 29/06/2019 14:56, Jeff Zhang wrote:
      >>>>      > Here's what the Zeppelin community did: we wrote a Python script
      >>>>      > to check the build status of a pull request.
      >>>>      > Here's the script:
      >>>>      > https://github.com/apache/zeppelin/blob/master/travis_check.py
      >>>>      >
      >>>>      > And this is the script we use in the Jenkins build job:
      >>>>      >
      >>>>      > if [ -f "travis_check.py" ]; then
      >>>>      >    git log -n 1
      >>>>      >    STATUS=$(curl -s $BUILD_URL | grep -e "GitHub pull request.*from.*" |
      >>>>      >      sed 's/.*GitHub pull request <a href=\"\(https[^"]*\).*from[^"]*.\(https[^"]*\).*/\1 \2/g')
      >>>>      >    AUTHOR=$(echo $STATUS | sed 's/.*[/]\(.*\)$/\1/g')
      >>>>      >    PR=$(echo $STATUS | awk '{print $1}' | sed 's/.*[/]\(.*\)$/\1/g')
      >>>>      >    #COMMIT=$(git log -n 1 | grep "^Merge:" | awk '{print $3}')
      >>>>      >    #if [ -z $COMMIT ]; then
      >>>>      >    #  COMMIT=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR |
      >>>>      >    #    grep -e "\"label\":" -e "\"ref\":" -e "\"sha\":" | tr '\n' ' ' |
      >>>>      >    #    sed 's/\(.*sha[^,]*,\)\(.*ref.*\)/\1 = \2/g' | tr = '\n' | grep -v "apache:" |
      >>>>      >    #    sed 's/.*sha.[^"]*["]\([^"]*\).*/\1/g')
      >>>>      >    #fi
      >>>>      >
      >>>>      >    # get commit hash from PR
      >>>>      >    COMMIT=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR |
      >>>>      >      grep -e "\"label\":" -e "\"ref\":" -e "\"sha\":" | tr '\n' ' ' |
      >>>>      >      sed 's/\(.*sha[^,]*,\)\(.*ref.*\)/\1 = \2/g' | tr = '\n' | grep -v "apache:" |
      >>>>      >      sed 's/.*sha.[^"]*["]\([^"]*\).*/\1/g')
      >>>>      >    sleep 30 # sleep few moment to wait travis starts the build
      >>>>      >    RET_CODE=0
      >>>>      >    python ./travis_check.py ${AUTHOR} ${COMMIT} || RET_CODE=$?
      >>>>      >    if [ $RET_CODE -eq 2 ]; then # try with repository name when travis-ci is not available in the account
      >>>>      >      RET_CODE=0
      >>>>      >      AUTHOR=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR |
      >>>>      >        grep '"full_name":' | grep -v "apache/zeppelin" |
      >>>>      >        sed 's/.*[:][^"]*["]\([^/]*\).*/\1/g')
      >>>>      >      python ./travis_check.py ${AUTHOR} ${COMMIT} || RET_CODE=$?
      >>>>      >    fi
      >>>>      >
      >>>>      >    if [ $RET_CODE -eq 2 ]; then # fail with can't find build information in the travis
      >>>>      >      set +x
      >>>>      >      echo "-----------------------------------------------------"
      >>>>      >      echo "Looks like travis-ci is not configured for your fork."
      >>>>      >      echo "Please setup by swich on 'zeppelin' repository at https://travis-ci.org/profile and travis-ci."
      >>>>      >      echo "And then make sure 'Build branch updates' option is enabled in the settings https://travis-ci.org/${AUTHOR}/zeppelin/settings."
      >>>>      >      echo ""
      >>>>      >      echo "To trigger CI after setup, you will need ammend your last commit with"
      >>>>      >      echo "git commit --amend"
      >>>>      >      echo "git push your-remote HEAD --force"
      >>>>      >      echo ""
      >>>>      >      echo "See http://zeppelin.apache.org/contribution/contributions.html#continuous-integration."
      >>>>      >    fi
      >>>>      >
      >>>>      >    exit $RET_CODE
      >>>>      > else
      >>>>      >    set +x
      >>>>      >    echo "travis_check.py does not exists"
      >>>>      >    exit 1
      >>>>      > fi
      >>>>      >
      >>>>      > Chesnay Schepler <ches...@apache.org> wrote on Sat, Jun 29, 2019
      >>>>      > at 3:17 PM:
      >>>>      >
      >>>>      >> Does this imply that a Jenkins job is active as long as the
      >>>>      >> Travis build runs?
      >>>>      >>
      >>>>      >> On 26/06/2019 21:28, Bowen Li wrote:
      >>>>      >>> Hi,
      >>>>      >>>
      >>>>      >>> @Dawid, I think the "long test running" issue that I mentioned in
      >>>>      >>> the first email, as you guys also said, belongs to "a big effort
      >>>>      >>> which is much harder to accomplish in a short period of time and
      >>>>      >>> may deserve its own separate discussion". Thus I didn't include it
      >>>>      >>> in what we can do in the foreseeable short term.
      >>>>      >>>
      >>>>      >>> Besides, I don't think that's the ultimate reason for the lack of
      >>>>      >>> build resources. Even if the build is shortened to something like
      >>>>      >>> 2h, the problem I described, that no build machine works for 6 or
      >>>>      >>> more hours during PST daytime, will still happen, because no
      >>>>      >>> machine from ASF INFRA's pool is allocated to Flink. I have paid
      >>>>      >>> close attention to the build queue over the past few weekdays, and
      >>>>      >>> it's a pretty clear pattern now.
      >>>>      >>>
      >>>>      >>> **The ultimate root cause** for that is that we don't have any
      >>>>      >>> **dedicated** build resources that we can stably rely on. I'm
      >>>>      >>> actually OK with waiting for a long time if there are build
      >>>>      >>> requests running; it means at least we are making progress. But I'm
      >>>>      >>> not OK with having no build resources. I think a better place to
      >>>>      >>> aim for in the short term is to always have at least a central pool
      >>>>      >>> (can be 3 or 5) of machines dedicated to building Flink at any
      >>>>      >>> time, or maybe to use users' resources.
      >>>>      >>>
      >>>>      >>> @Chesnay @Robert I synced with Jeff offline: the Zeppelin community
      >>>>      >>> is using a Jenkins job to automatically build on users' Travis
      >>>>      >>> accounts and link the result back to the GitHub PR. I guess the
      >>>>      >>> Jenkins job would fetch the latest upstream master and build the PR
      >>>>      >>> against it. Jeff has filed tickets to learn about and get access to
      >>>>      >>> the Jenkins infra. It'd be better to fully understand it first
      >>>>      >>> before judging this approach.
      >>>>      >>>
      >>>>      >>> I also heard good things about CircleCI, and ASF INFRA seems to
      >>>>      >>> have a pool of build capacity there too. It can be an alternative
      >>>>      >>> to consider.
      >>>>      >>>
      >>>>      >>> On Wed, Jun 26, 2019 at 12:44 AM Dawid Wysakowicz
      >>>>      >>> <dwysakow...@apache.org> wrote:
      >>>>      >>>
      >>>>      >>>> Sorry to jump in late, but I think Bowen's summary missed the most
      >>>>      >>>> important point from Chesnay's previous message. The ultimate
      >>>>      >>>> reason for all the problems is that the tests already take close
      >>>>      >>>> to 2 hours to run. I fully support this claim: "Unless people
      >>>>      >>>> start caring about test times before adding them, this issue
      >>>>      >>>> cannot be solved"
      >>>>      >>>>
      >>>>      >>>> This is also another reason why using users' Travis accounts
      >>>>      >>>> won't help. Every few weeks we reach the user's time limit for a
      >>>>      >>>> single profile. This makes the user's builds simply fail, until we
      >>>>      >>>> either properly decrease the time the tests take (which I am not
      >>>>      >>>> sure we ever did) or postpone the problem by splitting into more
      >>>>      >>>> profiles. (Note that the ASF Travis account has higher time
      >>>>      >>>> limits.)
      >>>>      >>>>
      >>>>      >>>> Best,
      >>>>      >>>>
      >>>>      >>>> Dawid
      >>>>      >>>>
      >>>>      >>>> On 26/06/2019 09:36, Robert Metzger wrote:
      >>>>      >>>>> Do we know if using "the best" available hardware would improve
      >>>>      >>>>> the build times?
      >>>>      >>>>> Imagine we ran the build on machines with plenty of main memory,
      >>>>      >>>>> to mount everything on a ramdisk, plus the latest CPU
      >>>>      >>>>> architecture?
      >>>>      >>>>>
      >>>>      >>>>> Throwing hardware at the problem could help reduce the time of an
      >>>>      >>>>> individual build, and using our own infrastructure would remove
      >>>>      >>>>> our dependency on Apache's Travis account (with the obvious
      >>>>      >>>>> downside of having to maintain the infrastructure).
      >>>>      >>>>> We could use an open source Travis alternative, to have a similar
      >>>>      >>>>> experience and make the migration easy.
      >>>>      >>>>>
      >>>>      >>>>>
      >>>>      >>>>> On Wed, Jun 26, 2019 at 9:34 AM Chesnay Schepler
      >>>>      >>>>> <ches...@apache.org> wrote:
      >>>>      >>>>>> From what I gathered, there's no special sauce that the Zeppelin
      >>>>      >>>>>> project uses which actually integrates a user's Travis account
      >>>>      >>>>>> into the PR. They just disabled Travis for PRs. And that's kind
      >>>>      >>>>>> of it.
      >>>>      >>>>>>
      >>>>      >>>>>> Naturally we can do this (duh) and save the ASF a fair amount of
      >>>>      >>>>>> resources, but there are downsides:
      >>>>      >>>>>>
      >>>>      >>>>>> The discoverability of the Travis check takes a nose-dive. Either
      >>>>      >>>>>> we require every contributor to always, on every commit, also
      >>>>      >>>>>> post a Travis build, or we have the reviewer sift through the
      >>>>      >>>>>> contributor's account to find it.
      >>>>      >>>>>>
      >>>>      >>>>>> This is rather cumbersome. Additionally, it's also not equivalent
      >>>>      >>>>>> to having a PR build.
      >>>>      >>>>>>
      >>>>      >>>>>> A normal branch build takes a branch as is and tests it. A PR
      >>>>      >>>>>> build merges the branch into master, and then runs it. (Fun fact:
      >>>>      >>>>>> This is why a PR with merge conflicts is not being run on
      >>>>      >>>>>> Travis.)
      >>>>      >>>>>>
      >>>>      >>>>>> And ultimately, everyone can already make use of this approach
      >>>>      >>>>>> anyway.
      >>>>      >>>>>>
      >>>>      >>>>>> On 25/06/2019 08:02, Jark Wu wrote:
      >>>>      >>>>>>> Hi Jeff,
      >>>>      >>>>>>>
      >>>>      >>>>>>> Thanks for sharing the Zeppelin approach. I think it's a good
      >>>>      >>>>>>> idea to leverage users' Travis accounts.
      >>>>      >>>>>>> In this way, we can have almost unlimited concurrent build jobs,
      >>>>      >>>>>>> and developers can restart builds by themselves (currently only
      >>>>      >>>>>>> committers can restart a PR's build).
      >>>>      >>>>>>>
      >>>>      >>>>>>> But I'm still not very clear on how to integrate a user's Travis
      >>>>      >>>>>>> build into the Flink pull request's build automatically. Can you
      >>>>      >>>>>>> explain in more detail?
      >>>>      >>>>>>>
      >>>>      >>>>>>> Another question: does Travis only build branches for user
      >>>>      >>>>>>> accounts?
      >>>>      >>>>>>> My concern is that builds for PRs rebase the user's commits onto
      >>>>      >>>>>>> the current master branch, which helps us find problems before
      >>>>      >>>>>>> merging. Builds for branches will miss the impact of new commits
      >>>>      >>>>>>> in master.
      >>>>      >>>>>>> How does Zeppelin solve this problem?
      >>>>      >>>>>>>
      >>>>      >>>>>>> Thanks again for sharing the idea.
      >>>>      >>>>>>>
      >>>>      >>>>>>> Regards,
      >>>>      >>>>>>> Jark
      >>>>      >>>>>>>
      >>>>      >>>>>>> On Tue, 25 Jun 2019 at 11:01, Jeff Zhang <zjf...@gmail.com> wrote:
      >>>>      >>>>>>>
      >>>>      >>>>>>>  Hi Folks,
      >>>>      >>>>>>>
      >>>>      >>>>>>>  Zeppelin met this kind of issue before. We solved it by
      >>>>      >>>>>>>  delegating each contributor's PR build to their own Travis
      >>>>      >>>>>>>  account (everyone can have 5 free slots for Travis builds).
      >>>>      >>>>>>>  The Apache account's Travis build is only triggered when a PR
      >>>>      >>>>>>>  is merged.
      >>>>      >>>>>>>
      >>>>      >>>>>>>
      >>>>      >>>>>>>  Kurt Young <ykt...@gmail.com> wrote on Tue, Jun 25, 2019 at
      >>>>      >>>>>>>  10:16 AM:
      >>>>      >>>>>>>
      >>>>      >>>>>>>  > (Forgot to cc George)
      >>>>      >>>>>>>  >
      >>>>      >>>>>>>  > Best,
      >>>>      >>>>>>>  > Kurt
      >>>>      >>>>>>>  >
      >>>>      >>>>>>>  >
      >>>>      >>>>>>>  > On Tue, Jun 25, 2019 at 10:16 AM Kurt Young <ykt...@gmail.com>
      >>>>      >>>>>>>  > wrote:
      >>>>      >>>>>>>  >
      >>>>      >>>>>>>  > > Hi Bowen,
      >>>>      >>>>>>>  > >
      >>>>      >>>>>>>  > > Thanks for bringing this up. We have actually discussed this,
      >>>>      >>>>>>>  > > and I think Till and George have already spent some time
      >>>>      >>>>>>>  > > investigating it. I have cc'ed both of them, and maybe they
      >>>>      >>>>>>>  > > can share their findings.
      >>>>      >>>>>>>  > >
      >>>>      >>>>>>>  > > Best,
      >>>>      >>>>>>>  > > Kurt
      >>>>      >>>>>>>  > >
      >>>>      >>>>>>>  > >
      >>>>      >>>>>>>  > > On Tue, Jun 25, 2019 at 10:08 AM Jark Wu <imj...@gmail.com>
      >>>>      >>>>>>>  > > wrote:
      >>>>      >>>>>>>  > >
      >>>>      >>>>>>>  > >> Hi Bowen,
      >>>>      >>>>>>>  > >>
      >>>>      >>>>>>>  > >> Thanks for bringing this up. We have also suffered from the
      >>>>      >>>>>>>  > >> long build time.
      >>>>      >>>>>>>  > >> I agree that we should focus on solving the build capacity
      >>>>      >>>>>>>  > >> problem in this thread.
      >>>>      >>>>>>>  > >>
      >>>>      >>>>>>>  > >> My observation is there is only one
      build is
      >>>>      running, all
      >>>>      >> the
      >>>>      >>>>>>>  others
      >>>>      >>>>>>>  > >> (other
      >>>>      >>>>>>>  > >> PRs, master) are pending.
      >>>>      >>>>>>>  > >> The pricing plan[1] of travis shows
      it can
      >>>> support
      >>>>      >> concurrent
      >>>>      >>>>>>>  build
      >>>>      >>>>>>>  > jobs.
      >>>>      >>>>>>>  > >> But I don't know which plan we are
      using, might
      >>>>      be the free
      >>>>      >>>>>>>  plan for
      >>>>      >>>>>>>  > open
      >>>>      >>>>>>>  > >> source.
      >>>>      >>>>>>>  > >>
      >>>>      >>>>>>>  > >> I cc-ed Chesnay who may have some
      experience on
      >>>>      Travis.
      >>>>      >>>>>>>  > >>
      >>>>      >>>>>>>  > >> Regards,
      >>>>      >>>>>>>  > >> Jark
      >>>>      >>>>>>>  > >>
      >>>>      >>>>>>>  > >> [1]: https://travis-ci.com/plans
      >>>>      >>>>>>>  > >>
      >>>>      >>>>>>>  > >> On Tue, 25 Jun 2019 at 08:11, Bowen Li <bowenl...@gmail.com> wrote:
      >>>>      >>>>>>>  > >>
      >>>>      >>>>>>>  > >> > Hi Steven,
      >>>>      >>>>>>>  > >> >
      >>>>      >>>>>>>  > >> > I think you may not have read what I wrote. The
      >>>>      >>>>>>>  > >> > discussion is about "unstable build **capacity**",
      >>>>      >>>>>>>  > >> > in other words "unstable / lack of build resources",
      >>>>      >>>>>>>  > >> > not "unstable build".
      >>>>      >>>>>>>  > >> >
      >>>>      >>>>>>>  > >> > On Mon, Jun 24, 2019 at 4:40 PM Steven Wu <stevenz...@gmail.com> wrote:
      >>>>      >>>>>>>  > >> >
      >>>>      >>>>>>>  > >> > > Long and sometimes unstable builds are definitely a
      >>>>      >>>>>>>  > >> > > pain point.
      >>>>      >>>>>>>  > >> > >
      >>>>      >>>>>>>  > >> > > I suspect the build failure here in
      >>>>      >>>>>>>  > >> > > flink-connector-kafka is not related to my change,
      >>>>      >>>>>>>  > >> > > but there is no easy way to re-run the build in the
      >>>>      >>>>>>>  > >> > > Travis UI. A Google search turned up a trick:
      >>>>      >>>>>>>  > >> > > closing and reopening the PR triggers a rebuild,
      >>>>      >>>>>>>  > >> > > but that could add noise to the PR activity.
      >>>>      >>>>>>>  > >> > >
      >>>>      >>>>>>>  > >> > > https://travis-ci.org/apache/flink/jobs/545555519
      >>>>      >>>>>>>  > >> > >
      >>>>      >>>>>>>  > >> > > travis-ci for my personal repo often failed after
      >>>>      >>>>>>>  > >> > > 4+ hours because it exceeded the time limit:
      >>>>      >>>>>>>  > >> > > "The job exceeded the maximum time limit for jobs,
      >>>>      >>>>>>>  > >> > > and has been terminated."
      >>>>      >>>>>>>  > >> > >
      >>>>      >>>>>>>  > >> > > On Mon, Jun 24, 2019 at 4:15 PM Bowen Li <bowenl...@gmail.com> wrote:
      >>>>      >>>>>>>  > >> > >
      >>>>      >>>>>>>  > >> > > > https://travis-ci.org/apache/flink/builds/549681530
      >>>>      >>>>>>>  > >> > > > This build request has been sitting at the
      >>>>      >>>>>>>  > >> > > > **HEAD of the queue** since I first saw it at
      >>>>      >>>>>>>  > >> > > > 10:30am PST (not sure how long it had been there
      >>>>      >>>>>>>  > >> > > > before 10:30am). It's 4:12pm PST now and it
      >>>>      >>>>>>>  > >> > > > hasn't started yet.
      >>>>      >>>>>>>  > >> > > >
      >>>>      >>>>>>>  > >> > > > On Mon, Jun 24, 2019 at 2:48 PM Bowen Li <bowenl...@gmail.com> wrote:
      >>>>      >>>>>>>  > >> > > >
      >>>>      >>>>>>>  > >> > > > > Hi devs,
      >>>>      >>>>>>>  > >> > > > >
      >>>>      >>>>>>>  > >> > > > > I've been experiencing the pain resulting from
      >>>>      >>>>>>>  > >> > > > > the lack of stable build capacity on Travis
      >>>>      >>>>>>>  > >> > > > > for Flink PRs [1]. Specifically, I have often
      >>>>      >>>>>>>  > >> > > > > noticed that no build in the queue makes any
      >>>>      >>>>>>>  > >> > > > > progress for hours, and then suddenly 5 or 6
      >>>>      >>>>>>>  > >> > > > > builds kick off all together after the long
      >>>>      >>>>>>>  > >> > > > > pause. I'm in the PST (UTC-08) time zone, and
      >>>>      >>>>>>>  > >> > > > > I've seen the pause last as long as 6 hours,
      >>>>      >>>>>>>  > >> > > > > from 9am to 3pm PST (let alone the time needed
      >>>>      >>>>>>>  > >> > > > > to drain the queue afterwards).
      >>>>      >>>>>>>  > >> > > > >
      >>>>      >>>>>>>  > >> > > > > I think this has greatly impacted our
      >>>>      >>>>>>>  > >> > > > > productivity. In my experience, PRs submitted
      >>>>      >>>>>>>  > >> > > > > in the early morning, PST, won't finish their
      >>>>      >>>>>>>  > >> > > > > build until late at night the same day.
      >>>>      >>>>>>>  > >> > > > >
      >>>>      >>>>>>>  > >> > > > > So my questions are:
      >>>>      >>>>>>>  > >> > > > >
      >>>>      >>>>>>>  > >> > > > > - Has anyone else experienced the same problem
      >>>>      >>>>>>>  > >> > > > > or made similar observations on TravisCI? (I
      >>>>      >>>>>>>  > >> > > > > suspect it has something to do with the time
      >>>>      >>>>>>>  > >> > > > > zone.)
      >>>>      >>>>>>>  > >> > > > >
      >>>>      >>>>>>>  > >> > > > > - What pricing plan of TravisCI is Flink
      >>>>      >>>>>>>  > >> > > > > currently using? Is it the free plan for open
      >>>>      >>>>>>>  > >> > > > > source projects? What is the guaranteed build
      >>>>      >>>>>>>  > >> > > > > capacity of the current plan?
      >>>>      >>>>>>>  > >> > > > >
      >>>>      >>>>>>>  > >> > > > > - If the current pricing plan (either free or
      >>>>      >>>>>>>  > >> > > > > paid) can't provide stable build capacity, can
      >>>>      >>>>>>>  > >> > > > > we upgrade to a higher-priced plan with larger
      >>>>      >>>>>>>  > >> > > > > and more stable build capacity?
      >>>>      >>>>>>>  > >> > > > >
      >>>>      >>>>>>>  > >> > > > > BTW, another factor that contributes to the
      >>>>      >>>>>>>  > >> > > > > productivity problem is that our build is
      >>>>      >>>>>>>  > >> > > > > slow: we run a full build for every PR, and a
      >>>>      >>>>>>>  > >> > > > > successful full build takes ~5h. We definitely
      >>>>      >>>>>>>  > >> > > > > have more options to solve that, for instance,
      >>>>      >>>>>>>  > >> > > > > modularizing the build graph and reusing
      >>>>      >>>>>>>  > >> > > > > artifacts from the previous build. But I think
      >>>>      >>>>>>>  > >> > > > > that would be a big effort that is much harder
      >>>>      >>>>>>>  > >> > > > > to accomplish in a short period of time and
      >>>>      >>>>>>>  > >> > > > > may deserve its own separate discussion.
      >>>>      >>>>>>>  > >> > > > >
      >>>>      >>>>>>>  > >> > > > > [1] https://travis-ci.org/apache/flink/pull_requests
      >>>>      >>>>>>>  > >> > > > >
      >>>>      >>>>>>>  > >> > > > >
      >>>>      >>>>>>>  > >> > > >
      >>>>      >>>>>>>  > >> > >
      >>>>      >>>>>>>  > >> >
      >>>>      >>>>>>>  > >>
      >>>>      >>>>>>>  > >
      >>>>      >>>>>>>  >
      >>>>      >>>>>>>
      >>>>      >>>>>>>
      >>>>      >>>>>>>  --
      >>>>      >>>>>>>  Best Regards
      >>>>      >>>>>>>
      >>>>      >>>>>>>  Jeff Zhang
      >>>>      >>>>>>>
      >>>>      >>
      >>>>
      >>>
      >>


