+1

vino yang <yanghua1...@gmail.com> wrote on Thu, Jul 4, 2019 at 7:55 PM:
> +1
>
> Dian Fu <dian0511...@gmail.com> wrote on Thu, Jul 4, 2019 at 7:09 PM:
>
> > +1. Thanks Chesnay and Bowen for pushing this forward.
> >
> > Regards,
> > Dian
> >
> > > On Jul 4, 2019, at 6:28 PM, zhijiang <wangzhijiang...@aliyun.com.INVALID> wrote:
> > >
> > > +1 and thanks for Chesnay's work on this.
> > >
> > > Best,
> > > Zhijiang
> > >
> > > ------------------------------------------------------------------
> > > From: Haibo Sun <sunhaib...@163.com>
> > > Send Time: Thursday, July 4, 2019, 18:21
> > > To: dev <dev@flink.apache.org>
> > > Cc: priv...@flink.apache.org <priv...@flink.apache.org>
> > > Subject: Re:Re: [VOTE] Migrate to sponsored Travis account
> > >
> > > +1. Thanks, Chesnay, for pushing this forward.
> > >
> > > Best,
> > > Haibo
> > >
> > >
> > > At 2019-07-04 17:58:28, "Kurt Young" <ykt...@gmail.com> wrote:
> > >> +1 and great thanks Chesnay for pushing this.
> > >>
> > >> Best,
> > >> Kurt
> > >>
> > >>
> > >> On Thu, Jul 4, 2019 at 5:44 PM Aljoscha Krettek <aljos...@apache.org> wrote:
> > >>
> > >>> +1
> > >>>
> > >>> Aljoscha
> > >>>
> > >>>> On 4. Jul 2019, at 11:09, Stephan Ewen <se...@apache.org> wrote:
> > >>>>
> > >>>> +1 to move to a private Travis account.
> > >>>>
> > >>>> I can confirm that Ververica will sponsor a Travis CI plan that is equivalent or a bit higher than the previous ASF quota (10 concurrent build queues)
> > >>>>
> > >>>> Best,
> > >>>> Stephan
> > >>>>
> > >>>> On Thu, Jul 4, 2019 at 10:46 AM Chesnay Schepler <ches...@apache.org> wrote:
> > >>>>
> > >>>>> I've raised a JIRA <https://issues.apache.org/jira/browse/INFRA-18703> with INFRA to inquire whether it would be possible to switch to a different Travis account, and if so what steps would need to be taken.
> > >>>>> We need a proper confirmation from INFRA since we are not in full control of the flink repository (for example, we cannot access the settings page).
> > >>>>>
> > >>>>> If this is indeed possible, Ververica is willing to sponsor a Travis account for the Flink project.
> > >>>>> This would provide us with more than enough resources.
> > >>>>>
> > >>>>> Since this makes the project more reliant on resources provided by external companies I would like to vote on this.
> > >>>>>
> > >>>>> Please vote on this proposal, as follows:
> > >>>>> [ ] +1, Approve the migration to a Ververica-sponsored Travis account, provided that INFRA approves
> > >>>>> [ ] -1, Do not approve the migration to a Ververica-sponsored Travis account
> > >>>>>
> > >>>>> The vote will be open for at least 24h, and until we have confirmation from INFRA. The voting period may be shorter than the usual 3 days since our current setup is effectively not working.
> > >>>>>
> > >>>>> On 04/07/2019 06:51, Bowen Li wrote:
> > >>>>>> Re: > Are they using their own Travis CI pool, or did they switch to an entirely different CI service?
> > >>>>>>
> > >>>>>> I reached out to Wes and Krisztián from the Apache Arrow PMC. They are currently moving away from ASF's Travis to their own in-house metal machines at [1] with a custom CI application at [2]. They've seen significant improvement w.r.t. both much higher performance and basically no resource waiting time, a "night-and-day" difference, quoting Wes.
> > >>>>>>
> > >>>>>> Re: > If we can just switch to our own Travis pool, just for our project, then this might be something we can do fairly quickly?
> > >>>>>>
> > >>>>>> I believe so, according to [3] and [4]
> > >>>>>>
> > >>>>>>
> > >>>>>> [1] https://ci.ursalabs.org/
> > >>>>>> [2] https://github.com/ursa-labs/ursabot
> > >>>>>> [3] https://docs.travis-ci.com/user/migrate/open-source-repository-migration
> > >>>>>> [4] https://docs.travis-ci.com/user/migrate/open-source-on-travis-ci-com
> > >>>>>>
> > >>>>>>
> > >>>>>> On Wed, Jul 3, 2019 at 12:01 AM Chesnay Schepler <ches...@apache.org> wrote:
> > >>>>>>
> > >>>>>>     Are they using their own Travis CI pool, or did they switch to an entirely different CI service?
> > >>>>>>
> > >>>>>>     If we can just switch to our own Travis pool, just for our project, then this might be something we can do fairly quickly?
> > >>>>>>
> > >>>>>>     On 03/07/2019 05:55, Bowen Li wrote:
> > >>>>>>> I responded in the INFRA ticket [1] that I believe they are using a wrong metric against Flink, and the total build time is a completely different thing than guaranteed build capacity.
> > >>>>>>>
> > >>>>>>> My response:
> > >>>>>>>
> > >>>>>>> "As mentioned above, since I started to pay attention to Flink's build queue a few tens of days ago, I'm in Seattle and I saw no build kicking off in PST daytime on weekdays for Flink. Our teammates in China and Europe have also reported similar observations. So we need to evaluate where the large total build time came from - if 1) your number and 2) our observations from three locations that cover pretty much a full day are all true, I **guess** one reason can be that, highly likely, the extra build time came from weekends when other Apache projects may be idle and Flink just drains hard its congested queue.
> > >>>>>>>
> > >>>>>>> Please be aware that we're not complaining about the lack of resources in general; I'm complaining about the lack of **stable, dedicated** resources. An example of the latter: currently, even if no build is in Flink's queue and I submit a request that becomes the queue head in the PST morning, my build won't even start for 6-8+h. That is an absurd amount of waiting time.
> > >>>>>>>
> > >>>>>>> That is to say, if ASF INFRA decides to adopt a quota system and grants Flink five DEDICATED servers that run all the time only for Flink, that'll be PERFECT and can totally solve our problem now.
> > >>>>>>>
> > >>>>>>> I feel what's missing in the ASF INFRA's Travis resource pool is some level of build capacity SLAs and certainty"
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Again, I believe these two problems differ in nature: long build time vs. lack of dedicated build resources. That is to say, shortening build time may relieve the situation, or it may not. I'm slightly negative on disabling IT cases for PRs, because the downside is that we risk bugs in a PR that UTs don't catch, which may cost a lot more to fix if they slow down or even block others, but I am open to others' opinions on it.
> > >>>>>>>
> > >>>>>>> AFAICT from the INFRA ticket [1], donating to ASF INFRA won't be a feasible way to solve our problem, since INFRA's pool is fully shared and they have no control over, or finer insight into, resource allocation to a specific Apache project. As mentioned in [1], Apache Arrow is moving away from the ASF INFRA Travis pool (they are actually surprised Flink hasn't planned to do so). I know that Spark is on its own build infra. If we all agree on funding our own build infra, I'd be glad to help investigate any potential options after releasing 1.9, since I'm super busy with 1.9 now.
> > >>>>>>>
> > >>>>>>> [1] https://issues.apache.org/jira/browse/INFRA-18533
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Tue, Jul 2, 2019 at 4:46 AM Chesnay Schepler <ches...@apache.org> wrote:
> > >>>>>>>
> > >>>>>>>> As a short-term stopgap, since we can assume this issue to become much worse in the following days/weeks, we could disable IT cases in PRs and only run them on master.
> > >>>>>>>>
> > >>>>>>>> On 02/07/2019 12:03, Chesnay Schepler wrote:
> > >>>>>>>>> People really have to stop thinking that just because something works for us it is also a good solution.
> > >>>>>>>>> Also, please remember that our builds run for 2h from start to finish, and not the 14 _minutes_ it takes for zeppelin.
> > >>>>>>>>> We are dealing with an entirely different scale here, both in terms of build times and number of builds.
> > >>>>>>>>>
> > >>>>>>>>> In this very thread people have been complaining about long queue times for their builds. Surprise, other Apache projects have been suffering the very same thing due to us not controlling our build times. While switching services (be it Jenkins, CircleCI or whatever) will possibly work for us (and these options are actually attractive, like CircleCI's proper support for build artifacts), it will also likely result in us negatively affecting other projects in significant ways.
> > >>>>>>>>>
> > >>>>>>>>> Sure, the Jenkins setup has a good user experience for us, at the cost of blocking Jenkins workers for a _lot_ of time. Right now we have 25 PRs in our queue; that's possibly 50h we'd consume of Jenkins resources, and the European contributors haven't even really started yet.
> > >>>>>>>>>
> > >>>>>>>>> FYI, the latest INFRA response from INFRA-18533:
> > >>>>>>>>>
> > >>>>>>>>> "Our rough metrics shows that Flink used over 5800 hours of build time last month. That is equal to EIGHT servers running 24/7 for the ENTIRE MONTH. EIGHT. nonstop.
> > >>>>>>>>> When we discovered this last night, we discussed it some and are going to tune down Flink to allow only five executors maximum. We cannot allow Flink to consume so much of a Foundation shared resource."
> > >>>>>>>>>
> > >>>>>>>>> So yes, we either
> > >>>>>>>>> a) have to heavily reduce our CI usage or
> > >>>>>>>>> b) fund our own, either maintaining it ourselves or donating to Apache.
> > >>>>>>>>>
> > >>>>>>>>> On 02/07/2019 05:11, Bowen Li wrote:
> > >>>>>>>>>> Looking at the git history of the Jenkins script, its core part was finished in March 2017 (with only two minor updates in 2017/2018), so it's been running for over two years now and it feels like the Zeppelin community has been quite happy with it. @Jeff Zhang <zjf...@gmail.com> can you share your insights and user experience with the Jenkins+Travis approach?
> > >>>>>>>>>>
> > >>>>>>>>>> Things like:
> > >>>>>>>>>>
> > >>>>>>>>>> - has the approach completely solved the resource capacity problem for the Zeppelin community? is the Zeppelin community happy with the result?
> > >>>>>>>>>> - is the whole configuration chain stable (e.g. uptime) enough?
> > >>>>>>>>>> - how often do you need to maintain the Jenkins infra? how many people are usually involved in maintenance and bug-fixes?
> > >>>>>>>>>>
> > >>>>>>>>>> The downside of this approach seems to me to be mostly maintenance - maintaining the script and the Jenkins infra.
> > >>>>>>>>>>
> > >>>>>>>>>> ** Having Our Own Travis-CI.com Account **
> > >>>>>>>>>>
> > >>>>>>>>>> Another alternative I've been thinking of is to have our own travis-ci.com account with paid dedicated resources. Note travis-ci.org is the free version and travis-ci.com is the commercial version. We currently use a shared resource pool managed by the ASF INFRA team on travis-ci.org, but we have no control over it - we can't see how it's configured, how many resources are available, how resources are allocated among Apache projects, etc.
> > >>>>>>>>>> The nice things about having an account on travis-ci.com are:
> > >>>>>>>>>>
> > >>>>>>>>>> - relatively low cost with a much better resource guarantee than what we currently have [1]: $249/month with 5 dedicated concurrency, $489/month with 10 concurrency
> > >>>>>>>>>> - low maintenance work compared to using Jenkins
> > >>>>>>>>>> - (potentially) no migration cost according to Travis's doc [2] (pending verification)
> > >>>>>>>>>> - full control over the build capacity/configuration compared to using ASF INFRA's pool
> > >>>>>>>>>>
> > >>>>>>>>>> I'd be surprised if we, as such a vibrant community, cannot find and fund $249*12=$2988 a year in exchange for a much better developer experience and much higher productivity.
> > >>>>>>>>>>
> > >>>>>>>>>> [1] https://travis-ci.com/plans
> > >>>>>>>>>> [2] https://docs.travis-ci.com/user/migrate/open-source-repository-migration
> > >>>>>>>>>>
> > >>>>>>>>>> On Sat, Jun 29, 2019 at 8:39 AM Chesnay Schepler <ches...@apache.org> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>     So yes, the Jenkins job keeps pulling the state from Travis until it finishes.
> > >>>>>>>>>>
> > >>>>>>>>>>     Not sure I'm comfortable with the idea of using Jenkins workers just to idle for several hours.
> > >>>>>>>>>>
> > >>>>>>>>>>     On 29/06/2019 14:56, Jeff Zhang wrote:
> > >>>>>>>>>>> Here's what the Zeppelin community did: we made a Python script to check the build status of a pull request.
> > >>>>>>>>>>> Here's the script:
> > >>>>>>>>>>> https://github.com/apache/zeppelin/blob/master/travis_check.py
> > >>>>>>>>>>>
> > >>>>>>>>>>> And this is the script we used in the Jenkins build job.
> > >>>>>>>>>>>
> > >>>>>>>>>>> if [ -f "travis_check.py" ]; then
> > >>>>>>>>>>>   git log -n 1
> > >>>>>>>>>>>   STATUS=$(curl -s $BUILD_URL | grep -e "GitHub pull request.*from.*" | sed 's/.*GitHub pull request <a href=\"\(https[^"]*\).*from[^"]*.\(https[^"]*\).*/\1 \2/g')
> > >>>>>>>>>>>   AUTHOR=$(echo $STATUS | sed 's/.*[/]\(.*\)$/\1/g')
> > >>>>>>>>>>>   PR=$(echo $STATUS | awk '{print $1}' | sed 's/.*[/]\(.*\)$/\1/g')
> > >>>>>>>>>>>   #COMMIT=$(git log -n 1 | grep "^Merge:" | awk '{print $3}')
> > >>>>>>>>>>>   #if [ -z $COMMIT ]; then
> > >>>>>>>>>>>   #  COMMIT=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR | grep -e "\"label\":" -e "\"ref\":" -e "\"sha\":" | tr '\n' ' ' | sed 's/\(.*sha[^,]*,\)\(.*ref.*\)/\1 = \2/g' | tr = '\n' | grep -v "apache:" | sed 's/.*sha.[^"]*["]\([^"]*\).*/\1/g')
> > >>>>>>>>>>>   #fi
> > >>>>>>>>>>>
> > >>>>>>>>>>>   # get commit hash from PR
> > >>>>>>>>>>>   COMMIT=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR | grep -e "\"label\":" -e "\"ref\":" -e "\"sha\":" | tr '\n' ' ' | sed 's/\(.*sha[^,]*,\)\(.*ref.*\)/\1 = \2/g' | tr = '\n' | grep -v "apache:" | sed 's/.*sha.[^"]*["]\([^"]*\).*/\1/g')
> > >>>>>>>>>>>   sleep 30 # sleep a few moments to wait until travis starts the build
> > >>>>>>>>>>>   RET_CODE=0
> > >>>>>>>>>>>   python ./travis_check.py ${AUTHOR} ${COMMIT} || RET_CODE=$?
> > >>>>>>>>>>>   if [ $RET_CODE -eq 2 ]; then # try with repository name when travis-ci is not available in the account
> > >>>>>>>>>>>     RET_CODE=0
> > >>>>>>>>>>>     AUTHOR=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR | grep '"full_name":' | grep -v "apache/zeppelin" | sed 's/.*[:][^"]*["]\([^/]*\).*/\1/g')
> > >>>>>>>>>>>     python ./travis_check.py ${AUTHOR} ${COMMIT} || RET_CODE=$?
> > >>>>>>>>>>>   fi
> > >>>>>>>>>>>
> > >>>>>>>>>>>   if [ $RET_CODE -eq 2 ]; then # fail because build information can't be found in travis
> > >>>>>>>>>>>     set +x
> > >>>>>>>>>>>     echo "-----------------------------------------------------"
> > >>>>>>>>>>>     echo "Looks like travis-ci is not configured for your fork."
> > >>>>>>>>>>>     echo "Please setup by switching on 'zeppelin' repository at https://travis-ci.org/profile and travis-ci."
> > >>>>>>>>>>>     echo "And then make sure 'Build branch updates' option is enabled in the settings https://travis-ci.org/${AUTHOR}/zeppelin/settings."
> > >>>>>>>>>>>     echo ""
> > >>>>>>>>>>>     echo "To trigger CI after setup, you will need to amend your last commit with"
> > >>>>>>>>>>>     echo "git commit --amend"
> > >>>>>>>>>>>     echo "git push your-remote HEAD --force"
> > >>>>>>>>>>>     echo ""
> > >>>>>>>>>>>     echo "See http://zeppelin.apache.org/contribution/contributions.html#continuous-integration."
> > >>>>>>>>>>>   fi
> > >>>>>>>>>>>
> > >>>>>>>>>>>   exit $RET_CODE
> > >>>>>>>>>>> else
> > >>>>>>>>>>>   set +x
> > >>>>>>>>>>>   echo "travis_check.py does not exist"
> > >>>>>>>>>>>   exit 1
> > >>>>>>>>>>> fi
> > >>>>>>>>>>>
> > >>>>>>>>>>> Chesnay Schepler <ches...@apache.org> wrote on Sat, Jun 29, 2019 at 3:17 PM:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Does this imply that a Jenkins job is active as long as the Travis build runs?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On 26/06/2019 21:28, Bowen Li wrote:
> > >>>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> @Dawid, I think the "long test running" issue that I mentioned in the first email, as you guys also said, belongs to "a big effort which is much harder to accomplish in a short period of time and may deserve its own separate discussion". Thus I didn't include it in what we can do in the foreseeable short term.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Besides, I don't think that's the ultimate reason for the lack of build resources. Even if the build is shortened to something like 2h, the problem I described, that no build machine works for 6 or more hours in PST daytime, will still happen, because no machine from ASF INFRA's pool is allocated to Flink. As I have paid close attention to the build queue in the past few weekdays, it's a pretty clear pattern now.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> **The ultimate root cause** for that is - we don't have any **dedicated** build resources that we can stably rely on. I'm actually ok with waiting for a long time if there are build requests running; it means at least we are making progress. But I'm not ok with no build resources. A better place I think we should aim at in the short term is to always have at least a central pool (can be 3 or 5) of machines dedicated to building Flink at any time, or maybe use users' resources.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> @Chesnay @Robert I synced with Jeff offline: the Zeppelin community is using a Jenkins job to automatically build on users' travis accounts and link the result back to the github PR. I guess the Jenkins job would fetch the latest upstream master and build the PR against it. Jeff has filed tickets to learn about and get access to the Jenkins infra. It'll be better to fully understand it first before judging this approach.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I also heard good things about CircleCI, and ASF INFRA seems to have a pool of build capacity there too. It can be an alternative to consider.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Wed, Jun 26, 2019 at 12:44 AM Dawid Wysakowicz <dwysakow...@apache.org> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Sorry to jump in late, but I think Bowen missed the most important point from Chesnay's previous message in the summary. The ultimate reason for all the problems is that the tests take close to 2 hours to run already. I fully support this claim: "Unless people start caring about test times before adding them, this issue cannot be solved"
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> This is also another reason why using user's Travis account won't help. Every few weeks we reach the user's time limit for a single profile. This makes the user's builds simply fail, until we either properly decrease the time the tests take (which I am not sure we ever did) or postpone the problem by splitting into more profiles. (Note that the ASF Travis account has higher time limits)
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Dawid
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On 26/06/2019 09:36, Robert Metzger wrote:
> > >>>>>>>>>>>>>>> Do we know if using "the best" available hardware would improve the build times?
> > >>>>>>>>>>>>>>> Imagine we would run the build on machines with plenty of main memory to mount everything to ramdisk + the latest CPU architecture?
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Throwing hardware at the problem could help reduce the time of an individual build, and using our own infrastructure would remove our dependency on Apache's Travis account (with the obvious downside of having to maintain the infrastructure)
> > >>>>>>>>>>>>>>> We could use an open source travis alternative, to have a similar experience and make the migration easy.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Wed, Jun 26, 2019 at 9:34 AM Chesnay Schepler <ches...@apache.org> wrote:
> > >>>>>>>>>>>>>>>> From what I gathered, there's no special sauce that the Zeppelin project uses which actually integrates a user's Travis account into the PR. They just disabled Travis for PRs. And that's kind of it.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Naturally we can do this (duh) and save the ASF a fair amount of resources, but there are downsides:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> The discoverability of the Travis check takes a nose-dive. Either we require every contributor to always, on every commit, also post a Travis build, or we have the reviewer sift through the contributor's account to find it.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> This is rather cumbersome. Additionally, it's also not equivalent to having a PR build.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> A normal branch build takes a branch as is and tests it. A PR build merges the branch into master, and then runs it. (Fun fact: This is why a PR with merge conflicts is not being run on Travis.)
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> And ultimately, everyone can already make use of this approach anyway.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On 25/06/2019 08:02, Jark Wu wrote:
> > >>>>>>>>>>>>>>>>> Hi Jeff,
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Thanks for sharing the Zeppelin approach. I think it's a good idea to leverage users' travis accounts.
> > >>>>>>>>>>>>>>>>> In this way, we can have almost unlimited concurrent build jobs and developers can restart builds by themselves (currently only committers can restart a PR's build).
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> But I'm still not very clear on how to integrate a user's travis build into the Flink pull request's build automatically. Can you explain in more detail?
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Another question: does travis only build branches for user accounts?
> > >>>>>>>>>>>>>>>>> My concern is that builds for PRs rebase the user's commits against the current master branch.
> > >>>>>>>>>>>>>>>>> This helps us find problems before merge. Builds for branches will lose the impact of new commits in master.
> > >>>>>>>>>>>>>>>>> How does Zeppelin solve this problem?
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Thanks again for sharing the idea.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>> Jark
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Tue, 25 Jun 2019 at 11:01, Jeff Zhang <zjf...@gmail.com> wrote:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>     Hi Folks,
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>     Zeppelin met this kind of issue before; we solved it by delegating each contributor's PR build to their own travis account (everyone can have 5 free slots for travis builds).
> > >>>>>>>>>>>>>>>>>     The Apache account's travis build is only triggered when a PR is merged.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Kurt Young <ykt...@gmail.com> wrote on Tue, Jun 25, 2019 at 10:16 AM:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> (Forgot to cc George)
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>> Kurt
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> On Tue, Jun 25, 2019 at 10:16 AM Kurt Young <ykt...@gmail.com> wrote:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Hi Bowen,
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Thanks for bringing this up. We have actually discussed this, and I think Till and George have already spent some time investigating it. I have cced both of them, and maybe they can share their findings.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>> Kurt
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On Tue, Jun 25, 2019 at 10:08 AM Jark Wu <imj...@gmail.com> wrote:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Hi Bowen,
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Thanks for bringing this up. We also suffered from the long build time.
> > >>>>>>>>>>>>>>>>>>>> I agree that we should focus on solving the build capacity problem in this thread.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> My observation is that only one build is running; all the others (other PRs, master) are pending.
> > >>>>>>>>>>>>>>>>>>>> The pricing plan [1] of travis shows it can support concurrent build jobs.
> > >>>>>>>>>>>>>>>>>>>> But I don't know which plan we are using; it might be the free plan for open source.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> I cc-ed Chesnay, who may have some experience with Travis.
> > >>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>> Regards, > > >>>>>>>>>>>>>>>>>>>> Jark > > >>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>> [1]: https://travis-ci.com/plans > > >>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>> On Tue, 25 Jun 2019 at 08:11, Bowen Li < > > >>>>>>>>>>>> bowenl...@gmail.com <mailto:bowenl...@gmail.com> > > >>>>>> <mailto:bowenl...@gmail.com <mailto:bowenl...@gmail.com>> > > >>>>>>>>>>>>>>>>> <mailto:bowenl...@gmail.com > > >>>>>> <mailto:bowenl...@gmail.com> > > >>>>>>>>>> <mailto:bowenl...@gmail.com > > >>>>>> <mailto:bowenl...@gmail.com>>>> wrote: > > >>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>> Hi Steven, > > >>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>> I think you may not read what I > > >>>>>> wrote. The > > >>>>>>>>>> discussion is > > >>>>>>>>>>>>>> about > > >>>>>>>>>>>>>>>>>> "unstable > > >>>>>>>>>>>>>>>>>>>>> build **capacity**", in another word > > >>>>>>>>>> "unstable / lack of > > >>>>>>>>>>>>>> build > > >>>>>>>>>>>>>>>>>>>> resources", > > >>>>>>>>>>>>>>>>>>>>> not "unstable build". > > >>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>> On Mon, Jun 24, 2019 at 4:40 PM > > >>>>>> Steven Wu > > >>>>>>>>>>>>>>>>> <stevenz...@gmail.com > > >>>>>> <mailto:stevenz...@gmail.com> <mailto:stevenz...@gmail.com > > >>>>>> <mailto:stevenz...@gmail.com>> > > >>>>>>>>>> <mailto:stevenz...@gmail.com > > >>>>>> <mailto:stevenz...@gmail.com> <mailto:stevenz...@gmail.com > > >>>>>> <mailto:stevenz...@gmail.com>>>> > > >>>>>>>>>>>>>>>>>> wrote: > > >>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>> long and sometimes unstable build is > > >>>>>>>>>> definitely a pain > > >>>>>>>>>>>>>>>> point. > > >>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>> I suspect the build failure here in > > >>>>>>>>>>>> flink-connector-kafka > > >>>>>>>>>>>>>>>>> is not > > >>>>>>>>>>>>>>>>>>>> related > > >>>>>>>>>>>>>>>>>>>>> to > > >>>>>>>>>>>>>>>>>>>>>> my change. 
but there is no easy > > >>>>>> re-run the > > >>>>>>>>>> build on > > >>>>>>>>>>>>>>>>> travis UI. > > >>>>>>>>>>>>>>>>>> Google > > >>>>>>>>>>>>>>>>>>>>>> search showed a trick of > > >>>>>> close-and-open the > > >>>>>>>>>> PR will > > >>>>>>>>>>>>>>>>> trigger rebuild. > > >>>>>>>>>>>>>>>>>>>> but > > >>>>>>>>>>>>>>>>>>>>>> that could add noises to the PR > > >>>>>> activities. > > >>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>> https://travis-ci.org/apache/flink/jobs/545555519 > > >>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>> travis-ci for my personal repo > > >>>>>> often failed > > >>>>>>>>>> with > > >>>>>>>>>>>>>>>>> exceeding time > > >>>>>>>>>>>>>>>>>> limit > > >>>>>>>>>>>>>>>>>>>>> after > > >>>>>>>>>>>>>>>>>>>>>> 4+ hours. > > >>>>>>>>>>>>>>>>>>>>>> The job exceeded the maximum time > > >>>>>> limit for > > >>>>>>>>>> jobs, and > > >>>>>>>>>>>> has > > >>>>>>>>>>>>>>>>> been > > >>>>>>>>>>>>>>>>>>>>> terminated. > > >>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 24, 2019 at 4:15 PM > > >>>>>> Bowen Li > > >>>>>>>>>>>>>>>>> <bowenl...@gmail.com > > >>>>>> <mailto:bowenl...@gmail.com> <mailto:bowenl...@gmail.com > > >>>>>> <mailto:bowenl...@gmail.com>> > > >>>>>>>>>> <mailto:bowenl...@gmail.com <mailto:bowenl...@gmail.com> > > >>>>>> <mailto:bowenl...@gmail.com <mailto:bowenl...@gmail.com>>>> > > >>>>>>>>>>>>>>>>>> wrote: > > >>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>> https://travis-ci.org/apache/flink/builds/549681530 > > >>>>>>>>>>>>>>>>> This build > > >>>>>>>>>>>>>>>>>>>>> request > > >>>>>>>>>>>>>>>>>>>>>>> has > > >>>>>>>>>>>>>>>>>>>>>>> been sitting at **HEAD of the > > >>>>>> queue** > > >>>>>>>>>> since I first > > >>>>>>>>>>>> saw > > >>>>>>>>>>>>>>>>> it at PST > > >>>>>>>>>>>>>>>>>>>>> 10:30am > > >>>>>>>>>>>>>>>>>>>>>>> (not sure how long it's been > > >>>>>> there before > > >>>>>>>>>> 10:30am). 
> > >>>>>>>>>>>>>>>>> It's PST > > >>>>>>>>>>>>>>>>>> 4:12pm > > >>>>>>>>>>>>>>>>>>>> now > > >>>>>>>>>>>>>>>>>>>>>> and > > >>>>>>>>>>>>>>>>>>>>>>> it hasn't started yet. > > >>>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 24, 2019 at 2:48 PM > > >>>>>> Bowen Li > > >>>>>>>>>>>>>>>>> <bowenl...@gmail.com > > >>>>>> <mailto:bowenl...@gmail.com> <mailto:bowenl...@gmail.com > > >>>>>> <mailto:bowenl...@gmail.com>> > > >>>>>>>>>> <mailto:bowenl...@gmail.com <mailto:bowenl...@gmail.com> > > >>>>>> <mailto:bowenl...@gmail.com <mailto:bowenl...@gmail.com>>>> > > >>>>>>>>>>>>>>>>>>>> wrote: > > >>>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>>>> Hi devs, > > >>>>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>>>> I've been experiencing the pain > > >>>>>>>>>> resulting from lack > > >>>>>>>>>>>>>>>>> of stable > > >>>>>>>>>>>>>>>>>>>> build > > >>>>>>>>>>>>>>>>>>>>>>>> capacity on Travis for Flink > > >>>>>> PRs [1]. > > >>>>>>>>>>>> Specifically, I > > >>>>>>>>>>>>>>>>> noticed > > >>>>>>>>>>>>>>>>>>>> often > > >>>>>>>>>>>>>>>>>>>>>> that > > >>>>>>>>>>>>>>>>>>>>>>> no > > >>>>>>>>>>>>>>>>>>>>>>>> build in the queue is making any > > >>>>>>>>>> progress for > > >>>>>>>>>>>> hours, > > >>>>>>>>>>>>>> and > > >>>>>>>>>>>>>>>>>> suddenly > > >>>>>>>>>>>>>>>>>>>> 5 > > >>>>>>>>>>>>>>>>>>>>> or > > >>>>>>>>>>>>>>>>>>>>>> 6 > > >>>>>>>>>>>>>>>>>>>>>>>> builds kick off all together > > >>>>>> after the > > >>>>>>>>>> long pause. > > >>>>>>>>>>>>>>>>> I'm at PST > > >>>>>>>>>>>>>>>>>>>>> (UTC-08) > > >>>>>>>>>>>>>>>>>>>>>>> time > > >>>>>>>>>>>>>>>>>>>>>>>> zone, and I've seen pause can > > >>>>>> be as > > >>>>>>>>>> long as 6 hours > > >>>>>>>>>>>>>>>>> from PST 9am > > >>>>>>>>>>>>>>>>>>>> to > > >>>>>>>>>>>>>>>>>>>>> 3pm > > >>>>>>>>>>>>>>>>>>>>>>>> (let alone the time needed to > > >>>>>> drain the > > >>>>>>>>>> queue > > >>>>>>>>>>>>>>>>> afterwards). 
> > >>>>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>>>> I think this has greatly > > >>>>>> impacted our > > >>>>>>>>>> productivity. > > >>>>>>>>>>>>>> I've > > >>>>>>>>>>>>>>>>>>>> experienced > > >>>>>>>>>>>>>>>>>>>>>> that > > >>>>>>>>>>>>>>>>>>>>>>>> PRs submitted in the early > > >>>>>> morning of > > >>>>>>>>>> PST time zone > > >>>>>>>>>>>>>>>>> won't finish > > >>>>>>>>>>>>>>>>>>>>> their > > >>>>>>>>>>>>>>>>>>>>>>>> build until late night of the > > >>>>>> same day. > > >>>>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>>>> So my questions are: > > >>>>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>>>> - Has anyone else experienced > > >>>>>> the same > > >>>>>>>>>> problem or > > >>>>>>>>>>>>>>>>> have similar > > >>>>>>>>>>>>>>>>>>>>>>> observation > > >>>>>>>>>>>>>>>>>>>>>>>> on TravisCI? (I suspect it > > >>>>>> has things > > >>>>>>>>>> to do with > > >>>>>>>>>>>> time > > >>>>>>>>>>>>>>>>> zone) > > >>>>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>>>> - What pricing plan of > > >>>>>> TravisCI is > > >>>>>>>>>> Flink currently > > >>>>>>>>>>>>>>>>> using? Is it > > >>>>>>>>>>>>>>>>>>>> the > > >>>>>>>>>>>>>>>>>>>>>> free > > >>>>>>>>>>>>>>>>>>>>>>>> plan for open source > > >>>>>> projects? What > > >>>>>>>>>> are the > > >>>>>>>>>>>>>>>>> guaranteed build > > >>>>>>>>>>>>>>>>>>>> capacity > > >>>>>>>>>>>>>>>>>>>>>> of > > >>>>>>>>>>>>>>>>>>>>>>>> the current plan? > > >>>>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>>>> - If the current pricing plan > > >>>>>> (either > > >>>>>>>>>> free or paid) > > >>>>>>>>>>>>>>>> can't > > >>>>>>>>>>>>>>>>>> provide > > >>>>>>>>>>>>>>>>>>>>>> stable > > >>>>>>>>>>>>>>>>>>>>>>>> build capacity, can we > > >>>>>> upgrade to a > > >>>>>>>>>> higher priced > > >>>>>>>>>>>>>>>>> plan with > > >>>>>>>>>>>>>>>>>> larger > > >>>>>>>>>>>>>>>>>>>>> and > > >>>>>>>>>>>>>>>>>>>>>>> more > > >>>>>>>>>>>>>>>>>>>>>>>> stable build capacity? 
> > >>>>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>>>> BTW, another factor that > > >>>>>> contribute to > > >>>>>>>>>> the > > >>>>>>>>>>>>>>>>> productivity problem > > >>>>>>>>>>>>>>>>>> is > > >>>>>>>>>>>>>>>>>>>>> that > > >>>>>>>>>>>>>>>>>>>>>>>> our build is slow - we run > > >>>>>> full build > > >>>>>>>>>> for every PR > > >>>>>>>>>>>>>> and a > > >>>>>>>>>>>>>>>>>>>> successful > > >>>>>>>>>>>>>>>>>>>>>> full > > >>>>>>>>>>>>>>>>>>>>>>>> build takes ~5h. We > > >>>>>> definitely have > > >>>>>>>>>> more options to > > >>>>>>>>>>>>>>>>> solve it, > > >>>>>>>>>>>>>>>>>> for > > >>>>>>>>>>>>>>>>>>>>>>> instance, > > >>>>>>>>>>>>>>>>>>>>>>>> modularize the build graphs > > >>>>>> and reuse > > >>>>>>>>>> artifacts > > >>>>>>>>>>>> from > > >>>>>>>>>>>>>> the > > >>>>>>>>>>>>>>>>>> previous > > >>>>>>>>>>>>>>>>>>>>>> build. > > >>>>>>>>>>>>>>>>>>>>>>>> But I think that can be a big > > >>>>>> effort > > >>>>>>>>>> which is much > > >>>>>>>>>>>>>>>>> harder to > > >>>>>>>>>>>>>>>>>>>>> accomplish > > >>>>>>>>>>>>>>>>>>>>>>> in > > >>>>>>>>>>>>>>>>>>>>>>>> a short period of time and > > >>>>>> may deserve > > >>>>>>>>>> its own > > >>>>>>>>>>>>>> separate > > >>>>>>>>>>>>>>>>>>>> discussion. > > >>>>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>>>> [1] > > >>>>>>>>>>>> https://travis-ci.org/apache/flink/pull_requests > > >>>>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>> -- > > >>>>>>>>>>>>>>>>> Best Regards > > >>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>> Jeff Zhang > > >>>>>>>>>>>>>>>>> > > >>>>>>>>>>>> > > >>>>>>>>>> > > >>>>>>>>> > > >>>>>>>> > > >>>>>> > > >>>>> > > >>>>> > > >>> > > >>> > > > > > > > > > >