Hello,

I would like to design a new infrastructure that would replace the DDPO
and the PTS, fix many current problems, and enable us to introduce new
features to help package maintainers.

I decided to use a DEP for this because I believe it's important to give
it some thought before starting to implement, and also because I'd like
to get something that will be widely adopted.

So far I have only had feedback from Zack. I'd like to have at least
some opinions from other members of the QA team who are involved in
maintaining various bits of infrastructure... but anyone is welcome to
comment and discuss the idea.

The current proposal is at the end in markdown format but you can also
read it online in HTML format:
http://dep.debian.net/deps/dep2/

This document is obviously a work in progress and I welcome any addition
that you would like to make to it.

Cheers,

PS: I am starting the discussion within the QA team, but once the
proposal is better formalized, I'll move the discussion to a wider
audience (i.e. -devel or -project).

PPS for Zack: I have extended the document since the version that you
reviewed.

------

[[!meta title="DEP-2: Debian Package Maintenance Hub"]]

    Title: Debian Package Maintenance Hub
    DEP: 2
    State: DRAFT
    Date: 2012-01-13
    Drivers: Raphael Hertzog <hert...@debian.org>
    URL: http://dep.debian.net/deps/dep2
    Abstract:
     Debian maintainers rely on a multitude of services (DDPO, PTS,
     DDPO-by-mail, BTS, etc.), and information sources, in order to do
     their job. The flow of information varies greatly from case to case.
     .
     This proposal is about creating a central infrastructure that would
     consolidate several of those services and that would standardize the
     information flow.

Rationale
---------

This new package maintenance infrastructure is needed:

* to fix long-standing problems;
* to provide a clean base to implement new features:
    * that will help maintainers do a better job;
    * that will help packaging teams to organize themselves;
    * that will help the QA team to ensure that all Debian packages are
      well maintained.

### Problems to solve

#### Maintainer vs Uploader

The flow of information is not the same depending on whether you're
listed in the Maintainer field (in which case most services mail you
directly) or not (in which case you're supposed to subscribe via the PTS
or via a dedicated mailing list).

But the opposite is true as well: some information is only available via
the PTS, and many maintainers have to subscribe to the PTS while
excluding almost everything just to get the information they want. See
also [#507288](http://bugs.debian.org/507288) for more discussion on
this topic.

This makes it very painful to change the Maintainer field because people
have to update their PTS subscriptions accordingly.

#### Duplication of work / inconsistency between the DDPO and the PTS

The DDPO and the PTS are completely separate services. This leads to
duplication of work when new information needs to be made available in
their respective interfaces. It can also lead to inconsistencies between
the two services when bugs occur or when different choices are made.

#### Mailing lists as Maintainer

We often have mailing lists listed in the Maintainer field, and it's not
clear who the real package maintainers are or how many of them there
are. The Uploaders field is often outdated, and/or is just a
representation of who last worked on the package instead of who feels
responsible for the package.

### Goals

#### Provide a working replacement for DDPO/PTS

Since the service aims to merge the DDPO and the PTS, it must be a
working replacement for both services, and its set of features must
encompass the features of the current services.

#### Support alternate notification systems

Email is the only official medium used to communicate information to
Debian package maintainers. If all the relevant mails go through a
central service, it's possible to store those emails and to forward the
relevant information by other means (RSS, XMPP, IRC, etc.).
Also, new maintainers can then have access to historical information
that used to be private for no good reason.

#### Enable new interactions with maintainers

The central role of this new "communication infrastructure" makes it
possible to design new interactions with maintainers. Instead of being
only a source of information, the infrastructure could be used to query
package maintainers and/or let them provide supplementary information.
This could be used to improve the MIA tracking process.

This infrastructure would also be a more natural place to store the
"available for Debian work" boolean flag ("vacation") that's currently
never used because it's buried in db.debian.org and impractical to
update.

This infrastructure could also be used to let maintainers document the
responsibilities that they have agreed to take on, and describe the
associated commitments. That way it would be easier to detect packages
that cannot be well maintained because their set of maintainers does not
cover all the tasks required to have a properly maintained package.

#### Replacing maintenance mailing lists

Packaging teams often separate the mailing list that gets the bug
traffic and other notifications from their main discussion mailing list.
This new infrastructure should entirely replace the former kind of
mailing list. Anybody receiving notifications and information directed
to the package maintainer should get them via this new infrastructure.

It allows us to know how many people are notified of a given problem.
If nobody is notified, the package is effectively orphaned. A more
interesting case to detect is when several people are being notified but
all of them are MIA or marked as not being available for Debian
(busy/on vacation).

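The two detection cases above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the `Subscriber`
record, its field names and the three status labels are hypothetical and
not part of the proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscriber:
    """Hypothetical record for one person receiving a package's notifications."""
    email: str
    mia: bool = False        # assumed to be set by the MIA tracking process
    available: bool = True   # the "available for Debian work" flag


def notification_status(subscribers):
    """Classify a package by who actually receives its notifications."""
    if not subscribers:
        # Nobody is notified: the package is effectively orphaned.
        return "orphaned"
    if all(s.mia or not s.available for s in subscribers):
        # People are notified, but none of them is currently able to react.
        return "unattended"
    return "maintained"
```

For example, a package whose only subscribers are one MIA person and one
person on vacation would be reported as "unattended", which is the case
the QA team would presumably want to look at first.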
High-level design of the new infrastructure
-------------------------------------------

### Fixing the flow of information

In order to cleanly solve the problem of the information flow, and to
get rid of the hacks made everywhere to send a copy of the mails to the
PTS, packages would be (progressively) modified to indicate
“Maintainer: <source>@pkgmaint.debian.org” in their control file.

Until all packages have been converted, the PTS would forward copies of
the mails to ensure that the new infrastructure can still be used for
all packages (even those that have not been updated yet).

Using this intermediary address also solves the problem of maintainers
who orphan their packages and are still listed as maintainers in many
released packages.

### Leveraging UDD

The PTS, at least, has been parsing Sources/Packages files by itself, as
well as a bunch of other sources of information. But many of the most
recent developments have piggy-backed on UDD to retrieve the information
needed, leaving to UDD the responsibility of bringing all the
information together in a single place.

This principle should be generalized to avoid duplication of work and to
make sure that all the important information is available in UDD.

But we must make sure that UDD won't become a bottleneck, either by
having a local (live?) replica of the database, or by ensuring that our
usage of UDD is limited to batch tasks that are not on the critical path
for real-time user requests.

### Using a modern framework for web development

DDPO is implemented in PHP. The PTS uses a mix of Perl, Python, XSLT and
shell scripts. While both work very well and are reliable, we can do
much better by using a modern framework for web development (starting
with internationalization of the web interface).

### API for data export

If the infrastructure is going to have a central role, there will be
requests to extract data out of the system.
We should cater for this by providing a public API (over HTTP) for
retrieving all the (public) information in some standardized manner.

JSON seems to be a good option for data export. It allows other services
to reuse information from the DPMH, and it makes it easy for various web
services to retrieve the information dynamically via JavaScript.

### Native support of packaging teams

Any Debian Developer must be able to create a "packaging team" in the
system. Each packaging team has a set of packages that it maintains (or
keeps an eye on).

Anyone can "subscribe" to the team and get (by default) all
correspondence for all packages associated with that team. The team
subscription can be tuned (much like the current PTS subscription) to
receive only a subset of the usual mails.

A direct package subscription would take precedence over a team
subscription, thus allowing users to exclude some packages from their
team subscription (or get more information for specific packages in
which they are particularly interested).

Implementation details
----------------------

Questions:

* How do we store emails? For how long? (we store all mails except the
  BTS mails)
* What language and web framework? (buxy's default choice: Python &
  Django)
* How do we authenticate users? And for DD super-powers?
* [To be completed]

Acronyms
--------

* DPMH: Debian Package Maintenance Hub (this project)
* DDPO: [Debian Developer's Packages Overview](http://qa.debian.org/developer.php)
* PTS: [Package Tracking System](http://packages.qa.debian.org)
* BTS: [Bug Tracking System](http://bugs.debian.org)
* UDD: [Ultimate Debian Database](http://udd.debian.org)
* DD: Debian Developer

Changes
-------

* 2012-01-13: Initial draft by Raphaël Hertzog.

-- 
Raphaël Hertzog ◈ Debian Developer

Pre-order a copy of the Debian Administrator's Handbook and help
liberate it: http://debian-handbook.info/liberation/