Hi Paul,

Apologies for the delay, I had a very busy beginning of the year. I'm now getting to these in preparation for IETF 122. I have incorporated these comments into the working copy <https://github.com/aarongable/draft-acme-ari/pull/94> (from which I will publish a new version soon), and have responded inline below.

Thanks for your comments!

On Mon, Jan 6, 2025 at 12:36 PM Paul Wouters via Datatracker <nore...@ietf.org> wrote:

> Paul Wouters has entered the following ballot position for
> draft-ietf-acme-ari-07: Discuss
>
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
>
> Thanks for this document. It can be a useful extension but I do have some
> issues I would like to discuss / clarify
>
>    Query the renewalInfo resource to get a suggested renewal window.
>    Select a uniform random time within the suggested window.
>    If the selected time is in the past, attempt renewal immediately.
>
> This seems to skew to "now" which would only cause the ACME server more
> load than without this extension (one GET and one actual renewal). Why not
> let the client select a uniform random time between "now" and "end" if
> "now" > "start" ?

In general, there are three kinds of ARI response:

- Entirely in the past, meaning the client should attempt renewal immediately.
- Entirely in the future, meaning the client should schedule renewal within that window.
- Overlapping the current instant.

If windows of the third kind were treated as "pick any time between now and the end of the window", then actual renewal times within such windows would skew strongly towards the end of the window. That's not a desirable property. Of course, you can't mitigate that fully -- if the first time a client ever checks ARI is already past the beginning of its suggested window, there's nothing you could have done to get them to renew during that missed period. But having the client renew immediately prevents the distribution from skewing *even further* towards the tail end of the window.

There's a second advantage: simplicity. The algorithm as described has only one real branch point: whether the randomly-selected timestamp is in the past or future.
All windows can be treated the same by the client, with no need to special-case the random selection logic based on where start and end timestamps fall relative to local time.

>    it indicates both the earliest time and a target time.
>
> It is really not the "earliest time" because an ACME server isn't going to
> refuse it? I would rewrite this to just say "it indicates the desired
> target time".

An ACME server absolutely may refuse it! Neither this draft nor the original RFC 8555 places restrictions on the server's ability to rate-limit requests. For example, Let's Encrypt rate-limits requests coming from each distinct origin IP, regardless of the target endpoint or content of those requests. A client retrieving ARI info in a tight loop could easily hit those limits.

That said, I'm happy to update the language used here. I've brought it more in line with RFC 7231, saying that the header indicates "not just the minimum but the desired amount of time that the client is asked to wait before issuing another request".

> This does bring up a point of concern. Clients who do not implement this
> have an advantage on an overloaded server compared to clients who do
> implement this. For example, let's say some new industry certification
> license says "certifiates MUST always be valid for at least two
> more weeks". Wouldn't it make more sense for the server to check the
> "urgency" of the client request and when (too) busy, start rejecting to
> renew those with plenty of lifetime left?

I'm not sure I follow. You're proposing a hypothetical in which some certificates must be replaced at least two weeks before they would expire, and therefore some renewal requests are more urgent than others? To be honest, I'm not particularly concerned about this situation -- most ACME servers are operating in homogeneous PKIs, more subject to the rules of that PKI's relying parties than of that PKI's subscribers.
That said, if an ACME server is aware of this situation, the fix is simple: suggest an earlier default renewal window for such certificates. Then all renewal requests will be pushed towards being equally urgent.

> I am also not sure about the argument of revocation for timing. Either
> the owner of the cert to be revoked knows this and is already in the
> process of replacing the cert/private key, and it wants to get a new
> cert issued "now", or it is completely unaware, and most likely whatever
> caused the leak of the current cert/private key, would also leak this
> renewed one. I don't think an ACME server can help with either cases by
> issuing shorter calls to renew. These would also be certificate specific
> and I understood this unauthenticated extension to be generic and based
> on load, and not on specific individual certificate issues.

There are many causes for revocation beyond key compromise. Perhaps the most impactful, and one of the original impetuses for the creation of this spec, is CA misissuance. If a CA discovers that some large population of its certificates have been misissued and need to be revoked, it needs to communicate that information to its subscribers quickly so that they can get replacement certificates issued. ARI is that communication mechanism. If those subscribers are polling ARI, their clients will automatically replace their certificates without the domain operator even needing to know that something untoward happened.

Another example is preventing malicious denial of service. Within ACME, a certificate can be revoked by a subscriber simply by proving that they control the identifiers within that certificate, whether or not they were the subscriber which initially requested its issuance. If a malicious actor gains temporary control of a domain (e.g. via a BGP hijack) they can revoke your certificate for that domain, causing your legitimate customers to see browser interstitial warnings.
If your client is checking ARI, it will quickly become aware of this situation and replace the certificate without your intervention.

>    Clients SHOULD set reasonable limits on the their checking interval.
>    For example, values under one minute could be treated as if they were
>    one minute, and values over one day could be treated as if they were
>    one day.
>
> This really does violate the "compliant clients MUST" clause :)

I don't follow -- the "conforming clients MUST" clause you comment on below is in regard to when the client must attempt renewal; this statement is in regard to when the client should re-check ARI.

> Security Considerations:
>
>    This document specifies that renewalInfo resources MUST be exposed
>    and accessed via unauthenticated GET requests, a departure from
>    RFC8555's requirement
>
> Where does it specify this, other then right here? This specification
> should be outside the Security Considerations section. What I can find is:
>
>    To request the suggested renewal information for a certificate,
>    the client sends a GET request to a path under the server's
>    renewalInfo URL.
>
> Maybe a sentence can be added there that this GET request is
> unauthenticated, so that an implementer does not accidentally send
> credentials of any kind? Maybe even say that a server MUST reject any
> attempted authorized connections for renewalInfo to ensure such badly
> implemented clients cannot prosper ?

Good point, I've added the adjective "unauthenticated" in Section 4.1 where the GET request is first introduced. I've also rephrased to remove the word "MUST" from the Security Considerations section, as you're absolutely right that that section should not include normative requirements.

That said, note that within ACME the alternative would not be "GET with some authentication headers set", the alternative would be "RFC 8555 POST-as-GET with JWS-based authentication". This protocol is explicitly eschewing POST-as-GET, not eschewing (e.g.)
bearer tokens.

> Perhaps also a clarifying sentence can be added along the lines of:
>
>    If an on-path attacker would force ACME clients to postpone renewal
>    indefinately, a properly implemented client would ignore these when
>    the lifetime of its certificate becomes critically low (eg 7 days ?).
>
> I also feel this belongs more in Section 4.3.2 with some concrete advise to
> implementers.

I feel that accounting for on-path attackers is outside the scope of this document: per RFC 8555 Section 6.1, all requests to an ACME server must be made over HTTPS / TLS. The RFC 8555 Security Considerations section already discusses the security of the client<->server channel, and this document does not meaningfully change the scope of threats to that channel.

> As for the last paragraph in the Security Considerations, it seems to
> specify specific server behaviour that belongs in the formal specification
> instead of as security example. If we look at the protocol requirement of
> the server to tell the client "renew now", why not define this by either
> using a timestamp of unix time 0 (eg 1970) or by introducing a third
> keyword along the "start" and "end" in the suggestedWindow property, eg
> "fetch-now": "recommended" ? Using some kind of fake time seems like a poor
> hack for a protocol, as the text in the security considerations already
> admits to (but then tries to band-aid the client)
>
> Again, I feel this belongs in the base document specification and not in
> the Security Consideration section.

I refer to my paragraph above about the simplicity of the algorithm: the client doesn't have to care about the timestamps contained in the message, it just has to pick a time between them and *then* begin caring about what time was picked. I feel that including special tombstone values (such as the unix epoch) is ripe for introducing parsing errors or other edge-case bugs within clients.
Similarly, having two different sets of fields (start/end vs fetch-now) which only make sense when populated separately is asking for confusion: what should a client do if a server accidentally populates all three fields? I believe the simplicity of this protocol is one of its strengths, and that introducing more fields and more logic decision points will make both server and client implementation harder, not easier.

> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
>
>    Conforming clients MUST attempt renewal
>
> I find this a bit weasel wording. How about:
>
>    Clients SHOULD attempet renewal
>
> Clearly, a client can have some overriding local policy concern that
> trumps the ACME servers

Done.

>    The keyIdentifier field of the certificate's AKI extension has the
>    hexadecimal bytes
>    69:88:5B:6B:87:46:40:41:E1:B3:7B:84:7B:A0:AE:2C:DE:01:C8:D4 as its
>    ASN.1 Octet String value. The base64url encoding of those bytes is
>    aYhba4dGQEHhs3uEe6CuLN4ByNQ=
>
> There seems to be an endian swap in here? Perhaps this text should be
> clarified? The same for the the certificate's Serial Number field in the
> next paragraph.

Could you be more specific? Do you mean that you believe there's an endianness swap between the hex bytes and the base64 string? Or between the values in the Appendix A certificate and the hex bytes here? Multiple active implementations (including simply `openssl x509 -noout -text -in appendix_a.pem | grep -A1 "Authority Key Identifier" | tail -n 1 | xxd -r -p | base64`) agree on the value aYhba4dGQEHhs3uEe6CuLN4ByNQ= being the correct base64-encoding for the Appendix A keyIdentifier.

> Maybe instead of:
>
>    GET https://example.com/....
>
> Use:
>
>    GET https://acme-server.example.com/.....
>
> Similar for the explanationURL value.

Good point, done.

>    Clients MUST stop checking RenewalInfo after a certificate is expired.
>
> I would stay "MUST skip checking RenewalInfo after a certificate is
> expired and immediately request a renewal."

I didn't want to conflate these two concepts -- maybe the client doesn't want to renew the certificate, and has let it expire on purpose. I've taken this comment and the one below as impetus to rephrase this paragraph, and I think you'll like the result better.

>    Clients MUST stop checking RenewalInfo after they consider a
>    certificate to be replaced (for instance, after a new certificate
>    for the same identifiers has been received and configured).
>
> I would also avoid the "MUST stop" construct here. Perhaps:
>
>    RenewalInfo MUST NOT be attempted for any certificate that has been
>    replaced (for instance, after a new certificate for the same
>    identifiers has been received and configured)

Good point. I haven't quite taken this suggestion verbatim, but have instead rephrased both of these sentences to:

"Clients MUST NOT check a certificate's RenewalInfo after the certificate has expired. Clients MUST NOT check a certificate's RenewalInfo after they consider the certificate to be replaced (for instance, after a new certificate for the same identifiers has been received and configured)."

Thanks again,
Aaron
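
For readers following the thread, the client-side selection logic discussed in this reply (pick a uniform random time in the suggested window, then branch once on whether that time has already passed) could be sketched roughly as below. This is a minimal illustration and not text from the draft: the function names `pick_renewal_time` and `should_renew_now` are hypothetical, and the sketch assumes the window's start and end have already been parsed into timezone-aware datetimes.

```python
import random
from datetime import datetime, timedelta, timezone


def pick_renewal_time(window_start, window_end, rng=random):
    """Return a uniformly random instant within the suggested window.

    Hypothetical helper: the window is handled the same way whether it
    lies in the past, in the future, or overlaps the current instant.
    """
    span = (window_end - window_start).total_seconds()
    return window_start + timedelta(seconds=rng.uniform(0, span))


def should_renew_now(selected, now):
    """The single branch point: has the selected instant already passed?"""
    return selected <= now


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    day = timedelta(days=1)
    # The three kinds of window described in the reply above.
    windows = {
        "entirely in the past": (now - 3 * day, now - 1 * day),
        "entirely in the future": (now + 1 * day, now + 3 * day),
        "overlapping now": (now - 1 * day, now + 1 * day),
    }
    for label, (start, end) in windows.items():
        selected = pick_renewal_time(start, end)
        if should_renew_now(selected, now):
            print(f"{label}: renew immediately")
        else:
            print(f"{label}: wait until {selected.isoformat()}")
```

Note that the selection step never compares the window's endpoints against the local clock; only the selected instant is compared, which is the simplicity property argued for above.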