> Are they using their own Travis CI pool, or did they switch to an
> entirely different CI service?
I reached out to Wes and Krisztián from the Apache Arrow PMC. They are
currently moving away from ASF's Travis to their own in-house bare-metal
machines at [1] with a custom CI application at [2]. They've se
Are they using their own Travis CI pool, or did they switch to an
entirely different CI service?
If we can just switch to our own Travis pool, just for our project, then
this might be something we can do fairly quickly?
On 03/07/2019 05:55, Bowen Li wrote:
I responded in the INFRA ticket [1] that I believe they are using the wrong
metric against Flink, and that the total build time is a completely different
thing from guaranteed build capacity.
My response:
"As mentioned above, since I started to pay attention to Flink's build
queue a few tens of days ago,
As a short-term stopgap, since we can expect this issue to become much
worse in the coming days/weeks, we could disable IT cases in PRs and
only run them on master.
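For illustration only, a minimal sketch of what such a split could look like as a
Travis build step, assuming a Maven build where the IT cases are bound to failsafe
(TRAVIS_EVENT_TYPE is set by Travis itself; the goals and the -DskipITs flag are
assumptions, not Flink's actual build setup):

    #!/usr/bin/env bash
    # Sketch: skip IT cases for pull_request builds, run everything on push/cron
    # builds (i.e. on master). TRAVIS_EVENT_TYPE is provided by Travis;
    # -DskipITs skips failsafe-bound integration tests (assumed layout).
    if [ "$TRAVIS_EVENT_TYPE" = "pull_request" ]; then
      mvn -B verify -DskipITs
    else
      mvn -B verify
    fi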
On 02/07/2019 12:03, Chesnay Schepler wrote:
People really have to stop thinking that just because something works
for us it is also a good solution.
Also, please remember that our builds run for 2h from start to finish,
and not the 14 _minutes_ it takes for Zeppelin.
We are dealing with an entirely different scale here, both in terms of
b
Looking at the git history of the Jenkins script, its core part was
finished in March 2017 (with only two minor updates in 2017/2018), so it's
been running for over two years now, and it feels like the Zeppelin community
has been quite happy with it. @Jeff Zhang can you share your
insights and user experi
So yes, the Jenkins job keeps polling the state from Travis until it
finishes.
Not sure I'm comfortable with the idea of using Jenkins workers just to
sit idle for several hours.
On 29/06/2019 14:56, Jeff Zhang wrote:
Here's what the Zeppelin community did: we made a Python script to check the
build status of the pull request.
Here's the script:
https://github.com/apache/zeppelin/blob/master/travis_check.py
And this is the script we used in the Jenkins build job:
if [ -f "travis_check.py" ]; then
git log -n 1
STATUS=$(cur
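Since the snippet above is cut off here, a rough sketch (not the actual Zeppelin
script) of what such a Jenkins shell step could look like, polling the Travis API
until the fork's build finishes; the repo slug, variable names and API fields are
assumptions for illustration:

    #!/usr/bin/env bash
    # Hypothetical Jenkins step: poll Travis for the latest build of a
    # contributor's fork and mirror its result as the Jenkins job result.
    set -euo pipefail
    USER="some-contributor"   # GitHub account owning the fork (placeholder)
    REPO="flink"              # repository name on that fork (placeholder)

    while true; do
      STATE=$(curl -s -H "Travis-API-Version: 3" \
          "https://api.travis-ci.org/repo/${USER}%2F${REPO}/builds?limit=1" \
        | python -c 'import json,sys; print(json.load(sys.stdin)["builds"][0]["state"])')
      echo "Travis state for ${USER}/${REPO}: ${STATE}"
      case "${STATE}" in
        passed) exit 0 ;;
        failed|errored|canceled) exit 1 ;;
        *) sleep 60 ;;   # created/started: keep waiting
      esac
    done

This also makes the trade-off mentioned earlier in the thread explicit: the
Jenkins executor stays occupied for as long as the loop runs.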
Does this imply that a Jenkins job is active as long as the Travis build
runs?
On 26/06/2019 21:28, Bowen Li wrote:
Hi,
@Dawid, I think the "long test running" as I mentioned in the first email,
also as you guys said, belongs to "a big effort which is much harder to
accomplish in a short perio
see https://issues.apache.org/jira/browse/INFRA-18533 for the overall
degradation of Travis capacity.
On 26/06/2019 21:50, Bowen wrote:
Just to elaborate a bit more on why a slow build is OK but having no resources
is not: say I submit a build request at PST 9am, no other requests exist, and
mine is at the head of the queue; currently it still cannot get built until 4 or 5pm.
On Jun 26, 2019, at 12:28, Bowen Li wrote:
Hi,
@Dawid, I think the "long test running" as I mentioned in the first email,
also as you guys said, belongs to "a big effort which is much harder to
accomplish in a short period of time and may deserve its own separate
discussion". Thus I didn't include it in what we can do in a foreseeable
shor
Sorry to jump in late, but I think Bowen missed the most important point
from Chesnay's previous message in the summary. The ultimate reason for
all the problems is that the tests take close to 2 hours to run already.
I fully support this claim: "Unless people start caring about test times
before a
Do we know if using "the best" available hardware would improve the build
times?
Imagine we ran the build on machines with plenty of main memory to mount
everything on a ramdisk, plus the latest CPU architecture.
Throwing hardware at the problem could help reduce the time of an
individual build, a
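As a rough sketch of the ramdisk part of that idea (assuming a Linux build
machine with enough RAM; the mount point, size and Maven flags are made-up
examples, not a tested Flink setup):

    # Put both the working copy and the Maven local repository on a tmpfs ramdisk.
    sudo mkdir -p /mnt/ramdisk
    sudo mount -t tmpfs -o size=32g tmpfs /mnt/ramdisk
    git clone https://github.com/apache/flink.git /mnt/ramdisk/flink
    cd /mnt/ramdisk/flink
    mvn -B clean verify -Dmaven.repo.local=/mnt/ramdisk/.m2

Whether that helps much depends on how I/O-bound vs. CPU-bound the tests actually are.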
From what I gathered, there's no special sauce that the Zeppelin
project uses which actually integrates a user's Travis account into the PR.
They just disabled Travis for PRs. And that's kind of it.
Naturally we can do this (duh) and save the ASF a fair amount of
resources, but there are downsi
Want to summarize Chesnay's points for everyone reading this thread: 1) the
build resources Flink is currently using belong to ASF INFRA, and 2) we are
waiting on ASF INFRA's response on whether we can donate/sponsor extra
build resources for Flink.
I think it'll be super helpful to pay and secure
On 24/06/2019 23:48, Bowen Li wrote:
- Has anyone else experienced the same problem or has similar observations
on TravisCI? (I suspect it has something to do with time zones)
In Europe we have the same problem.
- What pricing plan of TravisCI is Flink currently using? Is it the free
plan for op
Hi Jeff,
Thanks for sharing the Zeppelin approach. I think it's a good idea to
leverage users' own Travis accounts.
In this way, we can have almost unlimited concurrent build jobs, and
developers can restart builds by themselves (currently only committers can
restart a PR's build).
But I'm still not very
Hi Folks,
Zeppelin met this kind of issue before; we solved it by delegating each
contributor's PR build to their own Travis account (everyone can have 5 free
slots for Travis builds).
The Apache account's Travis build is only triggered when a PR is merged.
On Tue, Jun 25, 2019 at 10:16 AM Kurt Young wrote:
(Forgot to cc George)
Best,
Kurt
On Tue, Jun 25, 2019 at 10:16 AM Kurt Young wrote:
Hi Bowen,
Thanks for bringing this up. We have actually discussed this before, and I
think Till and George have already spent some time investigating it. I have
cc'ed both of them, and maybe they can share their findings.
Best,
Kurt
On Tue, Jun 25, 2019 at 10:08 AM Jark Wu wrote:
Hi Bowen,
Thanks for bringing this up. We have also suffered from the long build time.
I agree that we should focus on solving the build capacity problem in this
thread.
My observation is that only one build is running; all the others (other
PRs, master) are pending.
The pricing plan [1] of Travis shows it
Hi Steven,
I think you may not have read what I wrote. The discussion is about "unstable
build **capacity**", in other words "unstable / lack of build resources",
not "unstable builds".
On Mon, Jun 24, 2019 at 4:40 PM Steven Wu wrote:
Long and sometimes unstable builds are definitely a pain point.
I suspect the build failure here in flink-connector-kafka is not related to
my change, but there is no easy way to re-run the build in the Travis UI. A
Google search showed a trick: closing and reopening the PR will trigger a
rebuild, but that could add no
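For reference, one way to script the close-and-reopen trick is two calls against
the GitHub REST API; the PR number and token are placeholders, and this is a
workaround for illustration, not something to automate:

    # Close and reopen a PR to force CI to re-trigger (placeholders only).
    PR=12345
    curl -s -X PATCH -H "Authorization: token ${GITHUB_TOKEN}" \
      -d '{"state":"closed"}' "https://api.github.com/repos/apache/flink/pulls/${PR}"
    sleep 5
    curl -s -X PATCH -H "Authorization: token ${GITHUB_TOKEN}" \
      -d '{"state":"open"}' "https://api.github.com/repos/apache/flink/pulls/${PR}"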
https://travis-ci.org/apache/flink/builds/549681530 This build request has
been sitting at **HEAD of the queue** since I first saw it at PST 10:30am
(not sure how long it had been there before 10:30am). It's PST 4:12pm now and
it hasn't started yet.
On Mon, Jun 24, 2019 at 2:48 PM Bowen Li wrote:
Hi devs,
I've been experiencing the pain resulting from a lack of stable build
capacity on Travis for Flink PRs [1]. Specifically, I often noticed that no
build in the queue makes any progress for hours, and then suddenly 5 or 6
builds kick off all together after the long pause. I'm at PST (UTC-08) t