On Sat, 9 Aug 2014 22:15:17 +0400 Igor <lanthrus...@gmail.com> wrote:

> I really don't think data processing unit comes first.

The thing is, almost everything for automated build failure reporting
is in place on the client side. Build logs are automatically generated
and `emerge --info =$CATEGORY/${PF}' is easy, which is what we already
require for bug reports. Uploading two files to a web server is trivial
viewed from the client side.

> Communication protocol is already there - it's HTTP, method POST
> HTTP protocol is already with Python - CURL, WGET
> A reliable server ready to accept data from portage is all so there -
> it's Apache web server.

That's a matter of implementation. It doesn't concern the design, which
is what you want to be dealing with right now.

> What we don't have to start with portage class is just a number of
> parameters to submit.

Yes, we do. See above. Improvements are suggested and implemented from
time to time (see perhaps bug #436294).

> Once the list is ready -> add it reporter to portage -> and the
> server side data processing unit will appear.

Processing what?

Example 1:
Dr. A. Spammer wants to reach Gentoo developers because he wants to
sell them stuff, marry them or give them ONE MILLION DOLLARS. He can
spam mailing lists and aliases but they have spam filters or require
subscriptions or both. Or he can trivially upload his spam to the build
failure log processing server, which will do all the work of
distributing those "bug reports".

The problem:
How do you propose to keep that new service clean? We'd need a spam
filter and user authentication, which is a lot harder to do on the
client side (than merely uploading to an open server) and requires
proper user management on the server side too, which comes with
security concerns and manual involvement.

Example 2:
Mr. I.B. User configured his system with CFLAGS=-fomg-faster and now it
generates a ton of build failures.
All of these should go to /dev/null, but there we are running an
automated service that cannot be taught how to distinguish between
genuine bug reports and PEBKAC.

The problem:
How do you propose to filter out all the junk and promote genuine
issues to "bug reports"?

Both examples should make clear why we have a bug tracker and not an
automated build failure reporting facility - it works rather well and
it stops dead a lot of spam and bad reports.

It currently takes a handful of people a lot of dedication and time to
weed out the crappy bug reports (support requests, misconfiguration,
out of date packages, duplicates) from the real ones, but since filing
bug reports takes an actual effort by the reporter, we probably don't
see that many bad ones. With an automated build failure service in
place, we should be prepared to start manually processing an order of
magnitude more reports and still decide intelligently which are good
and which aren't. Without a really good interface (and a larger team of
people dedicated to that job), people will probably start ignoring the
entire thing pretty quickly.

Creating a better (command line) interface to bugzilla.g.o is probably
what we should be working on: extend pybugz (and the server side API)
somewhat and create a simple UI that asks some additional questions
(about steps to reproduce, a useful Summary/Description and so on) and
automatically submits that information. If a user wants to, he can then
flip a FEATURES switch which will automatically invoke that bugzilla
client, feeding it some preliminary information and a couple of files
to attach.

A better bug reporting client will not deal with all the problems of
bad reports (and will probably exacerbate some of them), but it will
stop spam and should encourage users to file more bug reports. It would
also be based on a vetted implementation that has already addressed
many of the concerns that your proposal would need to address from the
ground up.


     jer
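
As a rough illustration of the opt-in flow described above, here is a
minimal sketch. It is not an existing portage or pybugz interface: the
FEATURES value `auto-bug-report`, the `BugReport` class and both helper
functions are made up for this example. It only shows the idea that the
two files are attached automatically while the Summary and the steps to
reproduce still have to come from a human, and that nothing is produced
if they don't.

```python
from dataclasses import dataclass, field
from pathlib import Path

# Hypothetical FEATURES value; portage does not define such a flag.
REPORT_FEATURE = "auto-bug-report"


@dataclass
class BugReport:
    """Hypothetical container for what the bugzilla client would submit."""
    summary: str
    steps_to_reproduce: str
    attachments: list = field(default_factory=list)


def reporting_enabled(features: str) -> bool:
    """Return True if the user opted in via the (hypothetical) flag.

    `features` is a whitespace-separated string, like FEATURES in make.conf.
    """
    return REPORT_FEATURE in features.split()


def build_report(cpv, build_log, emerge_info, ask=input):
    """Bundle the build log and `emerge --info` output with user answers.

    Returns None when the user gives no summary or no steps to reproduce,
    so a report never goes out without some human effort behind it.
    """
    summary = ask(f"Summary for {cpv}: ").strip()
    steps = ask("Steps to reproduce: ").strip()
    if not summary or not steps:
        return None
    return BugReport(
        summary=f"{cpv}: {summary}",
        steps_to_reproduce=steps,
        attachments=[Path(build_log), Path(emerge_info)],
    )
```

Actually submitting the result (authentication, attaching the files,
duplicate checks) would be left to the bugzilla client and the server
side API, which is the part that already exists and has been vetted.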