On Wed, Jun 10, 2015 at 8:04 AM, Bjoern Michaelsen <bjoern.michael...@canonical.com> wrote:
> Hi,
> As such, here is one idea for infrastructure:
> - Create a branch master-tested
We can get all of that with git-notes alone, I think. In other words, instead of a separate branch, just annotate master, or maybe even maintain a tag on it, to indicate the last 'green master' in the sense you gave.

Today the Jenkins tinderboxes operate like their ancestors: they jump around while moving forward, so not every commit gets built. And since they are not all in sync, it is hard to guarantee that you will find a given commit that has been validated for all configurations.

_But_ with more hardware coming online, I want to move to a more 'bibisect build model' where _every_ commit gets built. Then I can have a matrix job so that we know the overall result of the build of a given commit for all configs, like we do for gerrit today. When a given commit is found all-green, we annotate it as such in git notes. (Git notes are more flexible than a tag because they let us do the builds somewhat out of order without pain, for instance to allow two or more 'sets' of builders to work side by side. Each set moves forward, so it can build incrementally, but that means the sets can _report_ out of order, so managing a tag would be extra pain.)

Regular fast-turnaround tinderboxes would still be in the mix to give a quick alert of a breaker on a given platform.

TDF is beefing up the infrastructure. We have two nice and beefy 1U boxes on purchase order that will become Windows builders. We are consolidating owned and lent Mac resources to improve network bandwidth and stability, but I intend to push for the purchase of a Mac Pro. (I got one myself and it performs quite well, to the point that it is cost-effective compared to a Mac mini, especially the more recent models.)

Linux boxes will have to ramp up too, but that is usually not much of a problem: cloud-based offerings are fairly competitive for that need, so we can be more reactive with resource capacity for Linux.
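To make the git-notes idea above a bit more concrete, here is a minimal Python sketch of how per-config results could be combined into one verdict per commit, and how the last 'green master' could be found even when builder sets report out of order. This is only an illustration under assumptions: the config names, the 'ci-status' notes ref, and the 'pass'/'fail' strings are made up, and the git command is only built, not executed.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical configuration names; the real build matrix is not given here.
CONFIGS = ("linux", "mac", "windows")
# Hypothetical notes namespace, kept apart from the default refs/notes/commits.
NOTES_REF = "ci-status"


def overall_status(results: Dict[str, str],
                   configs: Sequence[str] = CONFIGS) -> str:
    """Combine the per-config results of one commit into a single verdict."""
    if any(results.get(c) == "fail" for c in configs):
        return "red"
    if all(results.get(c) == "pass" for c in configs):
        return "green"
    return "pending"  # at least one config has not reported yet


def notes_command(commit: str, status: str) -> List[str]:
    """Build (but do not run) the git command that records the verdict."""
    return ["git", "notes", "--ref=" + NOTES_REF,
            "add", "-f", "-m", status, commit]


def latest_green(history: Sequence[str],
                 results: Dict[str, Dict[str, str]]) -> Optional[str]:
    """Return the newest all-green commit; history runs oldest to newest."""
    for commit in reversed(history):
        if overall_status(results.get(commit, {})) == "green":
            return commit
    return None


# Results arrive out of order: c3 is still waiting for Windows, c2 broke Mac.
history = ["c1", "c2", "c3"]
results = {
    "c3": {"linux": "pass", "mac": "pass"},
    "c1": {"linux": "pass", "mac": "pass", "windows": "pass"},
    "c2": {"linux": "pass", "mac": "fail", "windows": "pass"},
}
green = latest_green(history, results)
print(green, notes_command(green, overall_status(results[green])))
```

Because each verdict is attached to its own commit, a late report for c3 only changes the note on c3; nothing like a single moving tag has to be rewound or re-advanced.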
The one thing that everybody can pitch in to help with is this. There are three kinds of failure in CI:
- a user-induced one (these are the ones we are looking for): a change that makes something not build or fail test(s).
- an infra-induced one: the slave bot misbehaves for some reason and fails despite the fact that there is nothing really wrong. For these I try to have them reported as 'unstable' rather than 'Fails' as much as possible.
- a test-induced one: a test is unstable and produces random failures depending on circumstances... the infamous 'heisenbugs'.

A heisenbug can be a systemic/design problem or it can be a real bug that is hard to trigger. Either way these are not useful, and are in fact harmful in a CI context, because human nature is 'if you can't reproduce it, it is not a bug'. So the latter category, the real hard-to-trigger bug, always gets labeled 'systemic error' and ignored anyway, and it makes people numb to errors.

For automated testing, trust is paramount: heisenbug test failures are the enemy. A false non-failure is bad, but actually less painful.

Today we have different categories of tests, but they are mostly based on time-to-run rather than 'stability'. What I would like to see is a 'ci' target containing all the tests that _shall_ and _will_ pass unless there is a code bug, no exceptions. Of course time-to-run is important, but it is not the first criterion. Time-to-run can be mitigated relatively easily with 'money', but stability and trust in CI cannot. If and when we have the nice-to-have problem of having so many tests that it becomes impractical to run them all, all the time, we'll devise a two- or three-stage approach where we still get a fast turnaround for run-of-the-mill problems, and then deeper testing at a lower frequency.

All that being said, none of it matters if the culture does not follow. No amount of CI can make people care. What sets the tone is the core developer group; the rest of us look around at how things are done and emulate that behavior.
So we really need the core group of developers to lead by example with respect to taking the state of master seriously. That includes, for instance, avoiding the fire-and-forget 'one-liner that can't possibly break anything' at 5pm on a Friday. That includes proactively reverting, fixing, and resubmitting when a breakage is not obvious. That includes using gerrit more, especially for things that are not super time-sensitive. In other words: does it really matter if this patch lands in 5 hours or tomorrow rather than right now?

There is no hard and fast rule for this that would be flexible enough to accommodate real-world situations, but the only alternative to everyone's own best effort is pretty much black-and-white, all-or-nothing, machine-enforced rules, which are really not desirable.

PS: just to give an idea about the state of master. I recently built a bibisect for Windows covering the 5.0 dev period, that is, from the libreoffice-4-4-branch-point to the current head of the libreoffice-5-0 branch. That covered 10820 commits, of which 2168 were not _buildable_: they failed a make build-only with --disable-werror, which is as lenient a build criterion as one can have. That is 1 in 5 commits to master that did not even compile or link on Windows!!! during roughly the November 2014 to May 2015 period.

For more details: breakage can last quite long. Here are the longest runs of consecutive broken commits for that period (only runs >= 50 are listed), out of 163 breakages with more than 1 consecutive broken commit:

50 50 55 55 56 57 61 70 94 140 278

Bear in mind once again that these are compile-and-link-only builds... real CI builds (with werror and tests) fare much, much worse.

Norbert
_______________________________________________
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice