On Mon, Nov 7, 2016 at 3:20 PM, Ewen Cheslack-Postava <e...@confluent.io> wrote:
> On Mon, Nov 7, 2016 at 10:30 AM Raghav Kumar Gautam <rag...@apache.org>
> wrote:
>
> > Hi Ewen,
> >
> > Thanks for the feedback. Answers are inlined.
> >
> > On Sun, Nov 6, 2016 at 8:46 PM, Ewen Cheslack-Postava <e...@confluent.io>
> > wrote:
> >
> > > Yeah, I'm all for getting these to run more frequently and on lighter
> > > weight infrastructure. (By the way, I also saw the use of docker; I'd
> > > really like to get a "native" docker cluster type into ducktape at some
> > > point so all you have to do is bake the image and then spawn containers
> > > on demand.)
> >
> > I completely agree, supporting docker integration in ducktape would be
> > the ideal solution to the problem.
> >
> > > A few things. First, it'd be nice to know if we can chain these with
> > > normal PR builds or something like that. Even starting the system tests
> > > when we don't know the unit tests will pass seems like it'd be
> > > wasteful.
> >
> > If we do chaining, one problem it will bring is that the turnaround time
> > will suffer. It would take 1.5 hrs to run the unit tests and then another
> > 1.5 hrs to run the ducktape tests. Also, don't devs run the relevant unit
> > tests before they submit a patch?
>
> Yeah, I get that. Turnaround time will obviously suffer from serializing
> anything. The biggest problem today is that Jenkins builds are not as
> highly parallelized as most users' local runs, and the large number of
> integration tests that are baked into the unit tests means they take quite
> a long time. While the local runtime has been creeping up quite a bit
> recently, it's still under 15 min on a relatively recent MBP. Ideally we
> could just get the Jenkins builds to finish faster...

I investigated a little bit, and it seems that the unit tests are not
entirely stable, so it does not make sense to run them serially as of now.
For example: https://github.com/apache/kafka/pull/2107 (the unit tests were
passing after the 1st commit and failing after the 2nd commit, even though
the 2nd commit only had comment changes.)
https://github.com/apache/kafka/pull/2108
https://github.com/apache/kafka/pull/2099
https://github.com/apache/kafka/pull/2093

> > > Second, agreed on getting things stable before turning this on across
> > > the board.
> >
> > I have done some work on stabilizing the tests, but I need help from the
> > Kafka community to take this further. It would be great if someone could
> > guide me on how to do this. Should we start with a subset of tests that
> > are stable and enable others as we make progress? Who are the people I
> > can work with on this problem?
>
> It'll probably be a variety of people, because it depends on which
> components are unstable. For example, just among committers, different
> folks know different areas of the code (and especially the system tests)
> to different degrees. I can probably help across the board in terms of
> ducktape/system test stuff, but for any individual test you'll probably
> just want to git blame to figure out who might be best to ask for help.
>
> I can take a pass at this patch and see how much makes sense to commit
> immediately. If we don't immediately start getting feedback on failing
> tests and can instead make progress by requesting them manually on only
> some PRs or something like that, then that seems like it could be
> reasonable.
>
> My biggest concern, just taking a quick pass at the changes, is that we're
> doing a lot of renaming of tests just to split them up rather than by
> logical grouping. If we need to do this, it seems much better to add a
> small amount of tooling to ducktape to execute subsets of tests (e.g.
> split across N subsets of the tests). It requires more coordination
> between ducktape and getting this landed, but it feels like a much cleaner
> solution, and one that could eventually take advantage of additional
> information (e.g. if it knows the avg runtime from previous runs, it can
> divide tests based on that instead of only considering the # of tests).

I agree that the ideal solution would be to add support for this in
ducktape. But since this is going to be a big change, can we do this in a
separate JIRA?
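To make the splitting idea concrete, here is a minimal sketch of the kind
of deterministic assignment being discussed; the TC_TOTAL/TC_INDEX
environment variable names and the test paths are illustrative only, not
the actual POC code:

    import os

    def subset(tests, num_splits, index):
        # Sort first so every job computes the same assignment, then take
        # every num_splits-th test starting at this job's index.
        ordered = sorted(tests)
        return [t for i, t in enumerate(ordered) if i % num_splits == index]

    if __name__ == "__main__":
        # Each CI matrix entry would export its own index (0..TC_TOTAL-1).
        tests = ["kafkatest/tests/replication_test.py",
                 "kafkatest/tests/quota_test.py"]
        mine = subset(tests,
                      int(os.environ.get("TC_TOTAL", "1")),
                      int(os.environ.get("TC_INDEX", "0")))
        print("\n".join(mine))

Dividing by average runtime from previous runs, as suggested above, would
only change the assignment function (e.g. greedy bin-packing on recorded
durations); the job layout would stay the same.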
> > > Confluent runs these tests nightly on full VMs in AWS and
> > > historically, besides buggy logic in tests, underprovisioned resources
> > > tend to be the biggest source of flakiness in tests.
> >
> > Good to know that I am not the only one worrying about this problem :-)
> >
> > > Finally, should we be checking w/ infra and/or Travis folks before
> > > enabling something this expensive? Are the Storm integration tests of
> > > comparable cost? There are some in-flight patches for parallelizing
> > > test runs of ducktape tests (which also results in better
> > > utilization). But even with those changes, the full test run is still
> > > quite a few VM-hours per PR, and we only expect it to increase.
> >
> > We can ask the infra people about this, but I think this will not be a
> > problem. For example, Flink
> > <https://travis-ci.org/apache/flink/builds/173852382> is using 11 hrs of
> > computation time for each run. For Kafka we are going to start with 6
> > hrs. Also, with the docker setup we can bring up the whole 12-node
> > cluster on a laptop and run ducktape tests against it, so test
> > development cycles will become faster.
>
> Sure, it's just that over time this tends to lead to the current state of
> Jenkins, where it can take many hours before you get any feedback because
> things are so backed up.
>
> -Ewen
>
> > With Regards,
> > Raghav.
> >
> > > -Ewen
> > >
> > > On Thu, Nov 3, 2016 at 11:26 AM, Becket Qin <becket....@gmail.com>
> > > wrote:
> > >
> > > > Thanks for the explanation, Raghav.
> > > >
> > > > If the workload is not a concern, then it is probably fine to run
> > > > tests for each PR update, although it may not be necessary :)
> > > >
> > > > On Thu, Nov 3, 2016 at 10:40 AM, Raghav Kumar Gautam <
> > > > rag...@apache.org> wrote:
> > > >
> > > > > Hi Becket,
> > > > >
> > > > > The tests would be run each time a PR is created/updated; this
> > > > > will look similar to https://github.com/apache/storm/pulls.
> > > > > Ducktape tests take about 7-8 hours to run on my laptop. For
> > > > > travis-ci we can split them into groups and run them in parallel.
> > > > > This was done in the POC run, which took 1.5 hrs using 10 splits
> > > > > with 5 jobs running in parallel.
> > > > > https://travis-ci.org/raghavgautam/kafka/builds/171502069
> > > > > For Apache projects the limit is 30 parallel jobs shared across
> > > > > all projects, so I expect it to take less time, but it also
> > > > > depends on the workload at the time.
> > > > > https://blogs.apache.org/infra/entry/apache_gains_additional_travis_ci
> > > > >
> > > > > Thanks,
> > > > > Raghav.
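On the docker setup mentioned above: a "native" docker cluster type of the
kind Ewen describes could look roughly like the sketch below. DockerCluster,
the image name, and the container naming scheme are all hypothetical, not
ducktape's actual API; it only illustrates spawning one container per node
from a pre-baked image.

    import subprocess

    class DockerCluster(object):
        """Hypothetical cluster backend: one container per node, all from a
        pre-baked image, instead of provisioning full VMs."""

        def __init__(self, image, num_nodes):
            self.nodes = []
            for i in range(num_nodes):
                # Start a detached container; a real backend would also
                # wire up ssh/exec access for the test driver.
                cid = subprocess.check_output(
                    ["docker", "run", "-d", "--name", "knode%d" % i, image])
                self.nodes.append(cid.decode().strip())

        def shutdown(self):
            # Tear down every container the cluster started.
            for cid in self.nodes:
                subprocess.check_call(["docker", "rm", "-f", cid])

Bringing up the 12-node laptop cluster would then amount to something like
DockerCluster("kafkatest/node:latest", 12) (image name assumed) and
pointing ducktape at those containers.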
> > > > > On Thu, Nov 3, 2016 at 9:41 AM, Becket Qin <becket....@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Thanks Raghav,
> > > > > >
> > > > > > +1 for the idea in general.
> > > > > >
> > > > > > One thing I am wondering is when the tests would be run. Would
> > > > > > they be run when we merge a PR, or every time a PR is
> > > > > > created/updated? I am not sure how long the tests in other
> > > > > > projects take. For Kafka it may take a few hours to run all the
> > > > > > ducktape tests; will that be an issue if we run the tests for
> > > > > > each update of the PR?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > > On Thu, Nov 3, 2016 at 8:16 AM, Harsha Chintalapani
> > > > > > <ka...@harsha.io> wrote:
> > > > > >
> > > > > > > Thanks, Raghav. I am +1 for having this in Kafka. It will help
> > > > > > > identify any potential issues, especially with big patches.
> > > > > > > Given that we have some tests failing due to timing issues,
> > > > > > > can we disable the failing tests for now so that we don't get
> > > > > > > any false negatives?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Harsha
> > > > > > >
> > > > > > > On Tue, Nov 1, 2016 at 11:47 AM Raghav Kumar Gautam <
> > > > > > > rag...@apache.org> wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I want to start a discussion about running ducktape tests
> > > > > > > > for each pull request. I have been working on KAFKA-4345
> > > > > > > > <https://issues.apache.org/jira/browse/KAFKA-4345> to enable
> > > > > > > > this using docker on travis-ci.
> > > > > > > > Pull request: https://github.com/apache/kafka/pull/2064
> > > > > > > > Working POC:
> > > > > > > > https://travis-ci.org/raghavgautam/kafka/builds/171502069
> > > > > > > >
> > > > > > > > In the POC I am able to run 124/149 tests, out of which 88
> > > > > > > > pass. The failures are mostly timing issues. We can run the
> > > > > > > > same scripts on a laptop, where I am able to run 138/149
> > > > > > > > tests successfully.
> > > > > > > >
> > > > > > > > For this to work we need to enable travis-ci for Kafka. I
> > > > > > > > can open an infra bug to request travis-ci for this.
> > > > > > > > Travis-ci is already running tests for many Apache projects
> > > > > > > > like Storm, Hive, Flume, Thrift, etc.; see:
> > > > > > > > https://travis-ci.org/apache/.
> > > > > > > >
> > > > > > > > Does this sound interesting? Please comment.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Raghav.
> > >
> > > --
> > > Thanks,
> > > Ewen
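For reference, tallying pass/fail counts like the 124/149 figures above
from a ducktape run could be done with a small helper along these lines.
The results/latest/report.json path and the JSON layout assumed here are
illustrative only and may not match ducktape's actual output format:

    import json
    import sys

    def tally(report_path):
        # Assumed layout: {"results": [{"test_status": "PASS"|"FAIL", ...}]}
        # -- ducktape's real report schema may differ.
        with open(report_path) as f:
            results = json.load(f).get("results", [])
        passed = sum(1 for r in results if r.get("test_status") == "PASS")
        print("%d/%d tests passed" % (passed, len(results)))

    if __name__ == "__main__":
        tally(sys.argv[1] if len(sys.argv) > 1
              else "results/latest/report.json")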