Ian Jackson writes ("Re: Call for Votes (getaddrinfo)"):
> Thus X wins and the resolution between -8<- above has been passed,
> overruling the maintainer.
I think we should send our rationales, including dissents, to the bug report. I've collated the opinions that people attached to their votes and the result is below.

I've included the assenting views in the order they were emailed, as that seems to make them easiest to make sense of. There was one dissent, which I put at the end. If there were several they too should probably have come in chronological order. It makes most sense to put dissents after assents as they're more likely to be responsive to assents than vice versa, and also to avoid confusing readers with a conflicting view at the start of the text.

Normally I think we should just send a collation like the one below straight to the bug report and other interested parties along with the actual decision text, but since this is a new process for dealing with rationales I thought people might want a chance to comment.

I think we should probably take the rationale text attached to votes as definitive, and not look through the whole mail thread. If a TC member wishes to amend or augment their rationale after voting they should do so very explicitly.

Ian.

----------------------------------------

Ian Jackson, assenting:

Introduction

1. We have been asked to rule on the application of RFC3484 section 6 rule 9 by the resolver in glibc.

2. Rule 9 requires a host to sort addresses according to the length of the initial prefix common with the host's own address, when deciding which of a peer's addresses to try in which order. Thus eg, a host 172.18.45.11 would prefer to make a connection to 172.18.45.6 rather than to 172.31.80.8.

3. This has been implemented in glibc upstream by having the DNS resolver sort addresses before passing them to the application via getaddrinfo.

Background and history

4.
Prior to the publication and implementation of RFC3484, and prior to the introduction of getaddrinfo, most hosts would use an implementation of gethostbyname to find IPv4 addresses to use for a peer, given its hostname. gethostbyname has almost universally returned the addresses in the order supplied by whatever DNS nameserver it was using.

5. In 1993, the then-ubiquitous nameserver implementation BIND was modified to implement a feature known as `DNS Round Robin'. This does not need to be explained in detail, but the intended and actual effect was that clients would be provided addresses (and other records) in a deliberately varying order, so that in the aggregate clients' choice of address to use would be distributed uniformly across the published addresses.

6. Between then and the recent implementation of rule 9 by some hosts, DNS round robin became universally deployed. It has been implemented by other nameservers and has become a de facto standard at least for the interpretation of multiple IPv4 addresses in the global DNS.

IPv6 transition

7. The primary use of getaddrinfo is to replace gethostbyname when an application is converted to support IPv6. gethostbyname cannot be sensibly used to support IPv6; while there are other interfaces that can be used instead, the routine practice has been to make certain very consistent sets of changes to applications, which include replacing the use of gethostbyname by getaddrinfo.

8. gethostbyname in current glibc does not implement rule 9. The effect therefore is that whether a particular host follows rule 9 for a particular protocol depends mainly on whether that particular version of the application in question has been updated in the host's operating system to support IPv6. (As well as, of course, whether the operating system's getaddrinfo uses rule 9.)

9. There are no known applications which specifically desire the rule 9 behaviour; we know of no case where an application uses getaddrinfo specifically to get rule 9.
10. There is therefore no rational reason for the difference between the behaviour of gethostbyname and getaddrinfo, other than perhaps implementation convenience.

Compatibility and benefits

11. Rule 9 is incompatible with the DNS Round Robin. Prior to rule 9, a system administrator would publish multiple addresses in the intent and expectation of getting roughly equal client load on each address.

12. When Debian's apt changed its behaviour to follow rule 9, it broke ftp.us.debian.org because the load suddenly became very unbalanced. Thus this incompatibility causes actual operational problems.

13. We know of no situations where multiple IPv4 addresses on the global Internet are published with the intent and expectation that rule 9 will be followed by client systems.

14. The nature of the IPv4 address space structure suggests that rule 9 is not in practice useful for IPv4 on the global Internet.

History and status of RFC3484

15. RFC3484 and rule 9 form part of a document set published as part of early IPv6 work.

16. At the time of publication of RFC3484, the intended IPv6 addressing architecture had a significantly different shape. 3484 and rule 9 appear to form part of a set of behaviours which go alongside rapid renumbering, which has now fallen out of favour.

17. There is no evidence that the authors of RFC3484, which is specifically headed as an IPv6 document, considered specifically the behaviour for IPv4 or realised that the specification conflicted with the widely-used DNS Round Robin.

18. RFC3484 was a product of IPv6 (ie networking) working groups, not DNS working groups.

Standards

18. The purpose of standards is interoperability. Where following a standard makes us less interoperable we should not follow the standard. Debian is entitled to deviate from standards, including published documents, if we consider it appropriate to do so.

19. We should of course consider carefully before going against a published document.
However, when the situation is clear, we should not be overly reluctant to do so. In cases where de jure and de facto standards disagree, we must make a judgement which we prefer based on all of the circumstances.

20. In any case RFC3484 is currently `Proposed Standard', which is the earliest and least mature form of standards track document, which can be expected to have rough edges.

Conclusions

21. Rule 9 is not the standard behaviour for IPv4, RFC3484 notwithstanding. Round Robin is the de facto standard behaviour (despite not having been officially standardised), and there can be little justification for making such a radical change at this stage.

22. RFC3484 is therefore in error when it applies rule 9 to IPv4. Not using rule 9 for IPv4 is unquestionably preferable.

23. It appears that RFC3484 is also unhelpful for IPv6. However, since there is no existing de-facto standard for IPv6, this conclusion is arguable.

24. Therefore I would insist on traditional DNS Round Robin, rather than rule 9, for IPv4; but I would only recommend against rule 9 in the case of IPv6.

25. It is clear that the IETF needs to revisit this issue and I would formally recommend to them that they do so.

Backporting to current stable

26. In my opinion this change should be backported to current stable. However, this decision does not need to be taken now. We can wait for experience with the change in unstable and testing, which will help convince doubters that there is no compatibility problem.

27. I encourage the submitter and other interested parties to pursue getting this changed in a stable update, and to bring the matter back to the Technical Committee if necessary to achieve this.

Responsibility of the Technical Committee to decide

28. One committee member has insisted on the presence of `leave the choice up to the maintainer' on the ballot (option M).
My understanding of the meaning of this wording is that if that option wins we refuse to make a decision on the matter and also refuse to deal with it any more. Ie, this option is equivalent to Further Discussion except that the committee will not discuss or vote any more but instead considers the matter closed.

29. I do not consider it appropriate for the committee to decline to issue a ruling. Once a matter has reached us it is for us to make a decision and we should not abdicate that responsibility. If the committee disagrees with the maintainer, but not sufficiently overwhelmingly so as to be able to overrule the maintainer, we should nevertheless issue a ruling clearly stating that we disagree. In this particular case the committee does seem to have a sufficient majority to overrule, if we can only get the mechanics of voting working properly.

30. It has also been suggested that we should not overrule the maintainer unless we consider the bug release-critical. This is an abdication of the responsibility of the committee. In particular, whether or not to overrule the maintainer should depend primarily on how _clear_ it is that the maintainer is wrong, rather than on how _serious_ the consequences are. The constitution's supermajority condition gives effect to the requirement for high confidence in a decision to overrule, and of course individual committee members will want to be sure of their ground in such a case.

31. Therefore I reject the suggestions that we should not decide the matter, or that we should not overrule without concluding that the problem is release-critical.

----------------------------------------

Andreas Barth, assenting:

Rationale is mostly known already - Rule 9 doesn't make sense in the IPv4 world (as we have discussed) and breaks current behaviour (for more detailed analysis, see e.g. Ian's mail I'm responding to).
I'm not arguing about backporting to stable or not right now, as I'm biased a bit on that :) and would like to see this question be handled in the usual SRM accept process. However, using the usual rules, chances are pretty good to get that done once the fix has reached testing.

----------------------------------------

Manoj Srivastava, assenting:

As I have mentioned before, I think we should be deciding an issue purely on its merits; and how egregious the error is should not count towards determining what the correct solution is. If our deliberations conclude that a maintainer is incorrect, well, that is what we concluded. Everyone makes mistakes.

----------------------------------------

Anthony Towns, dissenting:

Again, if we don't think this bug is severe enough to need to be fixed in stable (and thus qualifies as RC), I don't think we should be overruling the maintainer. If Josip's correct in saying that this is screwing over the Debian apt round-robin hosts, it seems like we should be saying this is RC, but nobody seemed willing to do that when I brought it up earlier.

> 4. Prior to the publication and implementation of RFC3484, and prior
> to the introduction of getaddrinfo, most hosts would use an
> implementation of gethostbyname to find IPv4 addresses to use for
> a peer, given its hostname. gethostbyname has almost universally
> returned the addresses in the order supplied by whatever DNS
> nameserver it was using.

getaddrinfo() also almost universally behaved that way until very recently.

> IPv6 transition
> 7. The primary use of getaddrinfo is to replace gethostbyname when an
> application is converted to support IPv6.

I would say the primary use of getaddrinfo is to resolve a domain name in a useful way. I don't think replacing gethostbyname is relevant -- if it behaves differently to gethostbyname that's a win if it's more useful and a loss if it's less useful; it's not always a loss merely because it's different.

> 9.
> There are no known applications which specifically desire the
> rule 9 behaviour; we know of no case where an application uses
> getaddrinfo specifically to get rule 9.

RFC3484 specifically allows rule 9 to be overridden if the implementation has a better process, so it's not reasonable for an application to rely on rule 9, afaics.

> 10. There is therefore no rational reason for the difference
> between the behaviour of gethostbyname and getaddrinfo, other than
> perhaps implementation convenience.

Consistency between IPv4 and IPv6 behaviours seems a perfectly rational desire, even if it doesn't warrant the cost of changing the application behaviour.

> 11. Rule 9 is incompatible with the DNS Round Robin.

It's perfectly compatible, it just overrides it.

> 12. When Debian's apt changed its behaviour to follow rule 9,
> it broke ftp.us.debian.org because the load suddenly became very
> unbalanced. Thus this incompatibility causes actual operational
> problems.

I've seen no evidence that that actually happened. There's some hearsay from Josip ("I'm told that this bug also broke round-robin DNS functionality for ftp.us.debian.org/http.us.debian.org"), but that's it.

> Standards
> 18. The purpose of standards is interoperability. Where following a
> standard makes us less interoperable we should not follow the
> standard. Debian is entitled to deviate from standards, including
> published documents, if we consider it appropriate to do so.

This doesn't affect interoperability either way, though. It changes the impact of Debian systems on services provided by round-robin hosts (ie, to possibly impact some servers more than others, depending on the distribution of clients, rather than doing equal balancing), and it results in changed expectations of users/developers/admins as to how host resolution on round-robin addresses will work.

> 23. It appears that RFC3484 is also unhelpful for IPv6.
> However, since there is no existing de-facto standard for IPv6, this
> conclusion is arguable.

RFC3484 is relied upon by other IPv6 drafts/standards in order to choose the correct class of address for a service (a roaming address versus a static one, a site-local address versus a global one, etc). Some of those can be dealt with by earlier rules (particularly site-local versus global), but that leaves many RFCs that do rely on the rule for IPv6.

> Backporting to current stable
> 26. In my opinion this change should be backported to current
> stable. However, this decision does not need to be taken now. We

I think this should be the maintainers' call. The call I think we should be making is whether this is an issue that needs to be corrected in stable, whether by the patch we've seen, or by some other means. If that fix doesn't happen immediately, but waits for further testing in unstable and lenny, that's fine -- it'll be waiting for the next point release in any case.

Again, if we don't think this is sufficiently serious to need to be fixed in stable, afaics that means we're ignoring the impact of Debian machines on round-robin services as an important consideration -- including ftp/http.us.d.o and security.d.o.

> 28. One committee member has insisted on the presence of `leave the
> choice up to the maintainer' on the ballot (option M). My
> understanding of the meaning of this wording is that if that
> option wins we refuse to make a decision on the matter and also
> refuse to deal with it any more. Ie, this option is equivalent to
> Further Discussion except that the committee will not discuss or
> vote any more but instead considers the matter closed.
> 29. I do not consider it appropriate for the committee to decline to
> issue a ruling. Once a matter has reached us it is for us to make
> a decision and we should not abdicate that responsibility.
We should *always* decline to make a decision unless we have clear evidence that it's *necessary* for us to step in. That is to say, it must be *important* that the issue be resolved, and that the maintainer cannot already be resolving it.

I can't see any way in which this issue can be important enough to be resolved without it also being important enough to resolve for our current stable release too, which afaics would make it by definition release critical. I don't understand why everyone seems to be passing on declaring it RC or important enough to need fixing in stable.

> If the
> committee disagrees with the maintainer, but not sufficiently
> overwhelmingly so as to be able to overrule the maintainer,

If the issue isn't particularly important or the maintainer is already handling it satisfactorily, the ctte shouldn't be spending its time agreeing or disagreeing with the maintainer. I'd say 412976 would be an example of that.

> In this particular case the committee does seem
> to have a sufficient majority to overrule, if we can only get the
> mechanics of voting working properly.

Overruling the maintainer should be an absolute last resort, not something we do anytime we see something that 75% of us happen to disagree with.

> 30. It has also been suggested that we should not overrule the
> maintainer unless we consider the bug release-critical. This is
> an abdication of the responsibility of the committee.

It's the responsibility of the technical committee to determine which bugs are important enough to warrant our attention to disputes about them. Not determining whether this bug is release-critical, which is to say warrants an update to stable if it applies to packages in stable, is abdicating our responsibility to evaluate the importance of this issue afaics.
Maybe you're in effect saying:

 - the use of rule9 for a function resolving IPv4 addresses is RC in glibc if many applications in Debian use that function for IPv4 addresses

 - there are/will be many applications using getaddrinfo() for IPv4 addresses in lenny, so this is RC for lenny and must be resolved

 - there aren't many applications using getaddrinfo() for IPv4 addresses in etch, so this does not need to be resolved (though it'd be nice if it were)

I've been assuming there are already sufficient getaddrinfo apps in etch that this is relevant for etch if it's relevant for lenny, but maybe people disagree with that?

I could endorse that, though I was under the impression that apt in etch used getaddrinfo for IPv4 resolution (which would seem sufficient Debian apps in etch to me to make the issue equally relevant to etch).

> In
> particular, whether or not to overrule the maintainer should
> depend primarily on how _clear_ it is that the maintainer is
> wrong, rather than on how _serious_ the consequences are.

I strongly disagree with this. Maintainers get things wrong very frequently; it's their job to fix these things, not the technical committee's. If the issue isn't both important to Debian *and* being mishandled/ignored by the maintainer, it's not in our purview. Establishing a policy or practice where the only bar to us overruling a maintainer is that someone reassign a bug to us and 75% of those of us who can be bothered voting disagree with the maintainer is a terrible idea, IMO.

As a consequence, I'll continue voting any attempts to overrule the maintainer that don't (IMO) clearly and consistently establish why the issue is important to Debian below further discussion.

And again, the only way I can see this issue being important to Debian is the overall effect of many Debian machines accessing round-robin services and failing to do so in a balanced way.
But afaics, if we are using that as our basis, it applies equally to stable and unstable, and warrants being treated as a release critical issue. If we're not willing to take that issue that seriously, I don't see any aspects of this problem that are important enough to warrant tech-ctte resolution.

-- 
Ian Jackson, at home. Local/personal: [EMAIL PROTECTED]
[EMAIL PROTECTED] http://www.chiark.greenend.org.uk/~ijackson/
Problems mailing me ? Send [EMAIL PROTECTED] the bounce (bypasses the blocks).
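
For readers unfamiliar with the rule under discussion, the prefix comparison described in paragraph 2 of the first rationale can be sketched in a few lines of Python. This is an illustrative simplification only, not glibc's implementation: the function names common_prefix_len and rule9_order are made up for this example, only IPv4 is handled, and the real RFC3484 algorithm applies several other rules before rule 9 is reached. Using the addresses from the rationale's own example, the sketch suggests why the rule works against round robin: however the nameserver rotates the records, the client's first choice comes out the same.

```python
import ipaddress
from typing import List


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses share (hypothetical helper)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    # The highest differing bit marks the end of the common prefix.
    return 32 - diff.bit_length()


def rule9_order(source: str, candidates: List[str]) -> List[str]:
    """Simplified stand-in for rule 9: longest common prefix with `source` first.

    sorted() is stable, so candidates with equal prefix length keep the
    order in which the nameserver supplied them.
    """
    return sorted(candidates, key=lambda c: -common_prefix_len(source, c))


if __name__ == "__main__":
    host = "172.18.45.11"
    # Two answers for the same name, as a round-robin nameserver might
    # hand them out on successive queries.
    for answer in (["172.18.45.6", "172.31.80.8"],
                   ["172.31.80.8", "172.18.45.6"]):
        print(answer, "->", rule9_order(host, answer))
```

Under these assumptions 172.18.45.6 shares 28 leading bits with the host and 172.31.80.8 shares 12, so both rotations are reordered to put 172.18.45.6 first.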