Indeed, sorry for the distraction!

Alex
On Tue, Aug 27, 2013 at 11:23 AM, John Griffith <[email protected]> wrote:
>
> On Tue, Aug 27, 2013 at 11:47 AM, Clark Boylan <[email protected]> wrote:
>
>> On Tue, Aug 27, 2013 at 10:15 AM, Clint Byrum <[email protected]> wrote:
>>
>> > Excerpts from John Griffith's message of 2013-08-27 09:42:37 -0700:
>> >> On Tue, Aug 27, 2013 at 10:26 AM, Alex Gaynor <[email protected]> wrote:
>> >>
>> >> > I wonder if there's any sort of automation we can apply to this, for
>> >> > example having known rechecks have "signatures" and, if a failure
>> >> > matches a signature, it auto-applies the recheck.
>> >>
>> >> I think we kind of already have that: the recheck list and the bug ID
>> >> assigned to each entry, no? Automatically scanning that list and doing
>> >> the recheck seems like overkill in my opinion. At some point human
>> >> thought/interaction is required, and I don't think it's too much to ask
>> >> a technical contributor to simply LOOK at the output from the test runs
>> >> against their patches and help out a bit. At the very least, if you
>> >> didn't test your patch yourself and waited for Jenkins to tell you it's
>> >> broken, I would hope that a submitter would at least be motivated to
>> >> fix the issue they introduced.
>> >
>> > It is worth thinking about, though, because "ask a technical contributor
>> > to simply LOOK" is a lot more expensive than "let a script confirm the
>> > failure and tack it onto the list for rechecks".
>> >
>> > Ubuntu has something like this going for all of their users and it is
>> > pretty impressive.
>> >
>> > Apport and/or whoopsie see crashes, look at the
>> > backtraces/coredumps/etc., and then (with user permission) submit a
>> > signature to the backend.
>> > It is then analyzed, and the result is this:
>> >
>> > http://errors.ubuntu.com/
>> >
>> > Known false positives are shipped alongside packages so that they do
>> > not produce noise, and known points of pain for debugging are eased by
>> > including logs and other things in bug reports when users are running
>> > the dev release. This results in a much better metric for which bugs to
>> > address first. IIRC update-manager also checks in with a URL that is
>> > informed partially by this data about whether or not to update packages,
>> > so if there is a high failure rate early on, the server side will
>> > basically signal update-manager "don't update right now".
>> >
>> > I'd love to see our CI system enhanced to do all of the pattern
>> > matching to group failures by common patterns, so that when a technical
>> > contributor looks at these groups they have tons of data points to _fix_
>> > the problem rather than just spending their precious time identifying it.
>> >
>> > The point of the recheck system, IMHO, isn't to make running rechecks
>> > easier; it is to find and fix bugs.
>>
>> This is definitely worth thinking about, and we had a session on
>> dealing with CI logs to do interesting things like update bugs and
>> handle rechecks automatically at the Havana summit [0]. Since then we
>> have built a logstash + elasticsearch system [1] that filters many of
>> our test logs and indexes a subset of what was filtered (typically
>> anything with a log level greater than DEBUG). Building this system is
>> step one in being able to detect anomalous logs, update bugs, and
>> potentially perform automatic rechecks with the appropriate bug.
>> Progress has been somewhat slow, but the current setup should be
>> mostly stable. If anyone is interested in poking at these tools to do
>> interesting automation with them, feel free to bug the Infra team.
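[The "signature" auto-recheck idea being discussed might look roughly like the sketch below: map regexes over the console log to recheck bug numbers, and emit the recheck comment on a match. The patterns, bug numbers, and function name here are made up purely for illustration, not real recheck-list entries.]

```python
import re

# Hypothetical table mapping known-failure "signatures" (regexes over the
# Jenkins console log) to recheck bug numbers. Illustrative entries only.
KNOWN_FAILURES = {
    r"Connection to neutron failed: Maximum attempts reached": 1234567,
    r"Timed out waiting for thing .* to become ACTIVE": 1235437,
}

def match_recheck(console_log):
    """Return a 'recheck bug N' comment if the log matches a known
    failure signature, else None (a human still has to look)."""
    for pattern, bug in KNOWN_FAILURES.items():
        if re.search(pattern, console_log):
            return "recheck bug %d" % bug
    return None
```

Unmatched failures would still land on a human, which keeps John's point intact: the script only confirms failures it has already seen.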
>> That said, we won't have something super-automagic like that before
>> the end of Havana, which makes John's point an important one. If previous
>> release feature freezes are any indication, we will continue to put
>> more pressure on the CI system as we near Havana's feature freeze. Any
>> unneeded rechecks or reverifies can potentially slow the whole process
>> down for everyone. We should be running as many tests as possible
>> locally before pushing to Gerrit (this is as simple as running `tox`)
>> and making a best effort to identify the bugs that cause failures when
>> performing rechecks or reverifies.
>>
>> [0] https://etherpad.openstack.org/havana-ci-logging
>> [1] http://ci.openstack.org/logstash.html
>>
>> Thank you,
>> Clark
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> [email protected]
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> The automation ideas are great, no argument there; I didn't mean to imply
> they weren't or to discount them. I just don't want the intent of the
> message to get lost in all the things we "could" do going forward.

--
"I disapprove of what you say, but I will defend to the death your right
to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)

"The people's good is the highest law." -- Cicero

GPG Key fingerprint: 125F 5C67 DFE9 4084
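[Clark's filter-and-index step, "indexes a subset of what was filtered (typically anything with a log level greater than DEBUG)", could be sketched roughly as below. The level ordering and log-line layout are assumptions for illustration, not the actual logstash grok rules in use.]

```python
# Minimal sketch of "index only lines above DEBUG"; field layout assumed.
LEVELS = {"DEBUG": 0, "INFO": 1, "AUDIT": 1, "WARNING": 2, "ERROR": 3, "TRACE": 3}

def worth_indexing(line):
    """Keep a log line for indexing only if its level is above DEBUG."""
    for token in line.split()[:4]:    # the level usually appears near the front
        if token in LEVELS:
            return LEVELS[token] > LEVELS["DEBUG"]
    return False                      # no recognizable level: skip the line

def filter_log(lines):
    return [line for line in lines if worth_indexing(line)]
```

Grouping the surviving lines into common failure patterns, as Clint suggests, would then be a second pass over this much smaller indexed set.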
