Review of:
ID: https://www.ietf.org/id/draft-gondwana-dkim2-motivation-01.html
DKIM2 Why DKIM needs replacing, and what a replacement would look like
draft-gondwana-dkim2-motivation-01
While the draft might have started with a direction that matches the
title, it is now more of an actual specification than a discussion
document. And it is more than an 'outline design'.
Abstract
This memo provides a rationale and outline design for replacing the
existing email security mechanisms with a new mechanism based around
a more strongly authenticated email delivery pathway, including an
asynchronous return channel.
Outline design?
Note: "replacing email security mechanisms"
Note: "more strongly authenticated email delivery pathway"
btw, the return channel is always asynchronous. Arguably even direct,
single-hop is asynchronous, since the user client is almost certainly
not connected to the process.
1. Background and motivations
In 2007, [RFC4871] (Domain Key Identified Mail / DKIM) was published,
outlining a mechanism for a domain to sign email in a way that
recipients could ensure that the email had come from an entity
possessing the secret key matching a public key published in the DNS
by the source domain. For clarity in this document we call this
established scheme "DKIM1".
*Again, the defined semantic of DKIM is not to ensure who a message
'came from'. This confuses common practice with design.*
It is to identify an entity that takes 'some' responsibility for it.
This is very different.
*And the difference is important because an email can get touched by
a lot of entities, each of which might provide a DKIM signature.*
Like the rest of the email architecture, the design of DKIM is much more
flexible. And this point of confusion sets the stage for an approach to
email that is much more constraining than email is designed to be. The
word Procrustean is apt.
A casual dismissal with a comment like 'things have changed' is the
usual response to this point. Unfortunately it mostly reflects a lack
of effort to consider tradeoffs, or to embrace the historical IETF bias
to retain as much usage flexibility as supportable.
This document has been obsoleted and updated many times since then,
and a large amount of operational experience has been gained using
it. However, it has some known weaknesses with some kinds of email
flow.
If an intermediary alters the email then the original DKIM1 signature
will no longer be verifiable.
This is not a 'design weakness'. DKIM was intended to survive simple,
classic MTA-based relaying, but it never had a goal of surviving
re-postings. It does what it was designed to do.
Re-posting a message can entail arbitrary transformations and a
technology like basic digital signing never had a goal of surviving
that. DKIM was /explicitly NOT/ designed to survive these transformations.
While there is nothing wrong with seeking to provide a mechanism that
does survive, it is a new requirement, not a failure in an existing
mechanism.
So the wording here is finding fault with something that was not a goal.
Using language that implies or claims failure, for things that were
known at the time and were deemed acceptable or out of scope, is not
appropriate. Especially not after roughly 20 years.
*What the above language actually implies is that this new effort
represents a substantially larger and more ambitious goal. One for
which I believe there is little or no technical history and
certainly no operational history.*
This lack of experience with any mechanism -- other than a basic email
object -- that works through multiple postings/deliveries makes this
aspect of the design goal an open research topic.
And there are no ready proposals that have been developed and discussed
and converged on, ever since the expansion of DMARC use created a
problem for multi-posting/delivery sequences. /That's quite a few years
without landing on a concrete proposal./
*Surviving arbitrary manipulations that might take place, when going
through delivery and re-posting, is another open research topic.*
In 2019, [RFC8617] (Authenticated
Received Chain / ARC) attempted to solve this issue. However, it
does not provide a mechanism for determining whether recipients will
recognise the ARC signatures or trust the veracity of the systems
that add them.
There have not been any email-related mechanisms for pre-determining
receiving system support, other than the MX record's indicating support
of receiving mail. So this might be another ambitious goal, depending
upon the specifics of the requirement, and especially since it needs to
work at scale.
"Trust the veracity" presumably means a receiving system's trusting the
assessment made of an originating DKIM signature which the ARC signer
breaks? The reader should not have to guess.
That was never a goal for ARC, since assessing trust -- other than
whether a signer was authorized to sign -- has always been a value-add
feature for all of these mechanisms, and outside of their scope.
*Trust is an entirely separate and challenging line of reputation,
etc. (research) work.*
A further issue is that bad actors may "replay" bad email that
systems have DKIM signed and erode the reputation of the signer.
Only bad actors who are within the signing domain's own community of users.
*Let me repeat this: DKIM Replay relies on having the message
originate with a user who is authorized to send from the abused
domain's platform.*
So, really, this is a failure of internal regulation and accountability
that is being externalized here.
Theoretically, it might involve a compromised account within that
service, of course. However, that has not been part of the narrative
reporting this problem.
Ad hoc systems have been developed so that systems that have placed a
DKIM signature onto an email may be informed about the quality of the
messages that they are relaying -- so called feedback loops.
However, there are no formal specifications for such schemes and
feedback may be sent where it is not actually required.
*A DKIM feedback loop seems to be a separate work item, a la DMARC.
In general, I'd encourage making the text distinguish different
goals and tasks more carefully.*
As noted in postings by others, there is some community practice
established here, so this is a specific task for which it well might be
possible to document and agree on a particular scheme that scales.
Furthermore, where the origin of a message has been forged and the
final intermediary in a chain finds that it is undeliverable, then
the Delivery Status Notification (DSN) may be sent to an unsuspecting
third-party -- a phenomenon called backscatter.
Given the regular misuse of the term forged, I strongly suggest using a
different term or, at least, defining its use here. For that matter,
specificity about 'the origin' would also help, given that it, too, can
translate into different technical specifics.
Apparently, 'origin' here means the SMTP Mail From. Since it has never
had any requirement about what address is in there -- other than SPF and
DMARC when it is using SPF -- forged is simply a false assertion.
Also, in spite of the name Mail From and the RFC 821 (etc.) language
about it, its actual usage has always been as a return address, rather
than as a declaration of authorship. That role is assigned to the
RFC822/733 From: field.
*This is not a small point. The ability to specify a return (i.e.,
handling notification) address that differs from the author has been
an important bit of flexibility for email.*
This document solves these and related problems in a holistic way, by
having every hop in a forwarding chain responsible for:
1. verifying the path that messages have taken to get to it,
including by being able to reverse modifications or by asserting
that it trusts the previous hop unconditionally, and that it is
the declared next hop in the chain.
The term forwarding is ambiguous. It is used to mean very different
types of behavior.
*If 'forwarding' means every MTA, this is a massive infrastructure
change to the entire Internet Mail service.*
If it only means Intermediaries/mediators that sit between a delivery
and a new posting, such as mailing lists, then it has the same adoption
barriers as ARC, which has proved challenging. At the least, this is
going to require very careful attention to adoption incentives -- cost
vs. benefit -- to the adopter.
Also, Intermediaries don't do delivery. MDAs do. Again, a technical
document like this needs to use terminology carefully, including its
references to email architectural components. Here, I gather, MTA and
MDA were conflated.
2. declaring (under protection of its own signature) where the
message is being sent to next.
Why?
3. asserting that it will pass control messages (including bounces,
abuse reports and delivery notifications) back to the previous
hop for a reasonable time.
Asserting to whom? Why?
2. Design Goals
1. It is intended that legacy mail systems constructed in the last
century will be able to interoperate with this new specification.
However, more recently developed systems will, after a period of
parallel running, need to be upgraded in order to continue to be
able to authenticate email.
What are the technical differences between the two? Why make the
distinction?
Anyhow, so yes, this is a complete infrastructure replacement.
(btw, the Internet's fairly consistent history with "a period of
parallel running" is that the period is forever, or at least some
decades. cf. RFC2822's attempt to 'obsolete' constructs.)
As such, any description of this process as 'transition' is unrealistic
and even misleading.
2. We favor simplicity over obscure functionality.
False, straw dichotomy.
Also not all that helpful in design guidance discussions, in spite of
seeming like it would be. Rather, it adds an awkward and arguably
arrogant tone to the draft.
On the other hand, what might be intended is something like:
2. Favor simple, minimal, essential design features, to facilitate
adoption and use.
(and I note that draft charter 5 has language along these lines.)
3. We aim to keep the number of cryptographic operations required
the same or less, for all the most common types of email flow.
Same or less than what? Presumably DKIM?
4. We aim to make all parts of the specification mandatory to
implement because experience shows that interworking is adversely
affected by providing optional functionality.
This misses the difference between essential core, versus useful
enhancement.
Consider email operation or enhancement with no SMTP options.
3. Basic ideas
An email is DKIM2 signed by the originator -- in pretty much the same
way as is done in the existing DKIM1 standard. In practice the vast
majority of mail is signed using relaxed/relaxed methods. DKIM2 will
only allow relaxed/relaxed.
Each "hop" that handles the email adds another, sequentially
numbered, DKIM2 header field.
MTA or Intermediary/mediator, or both?
If MTA, why?
Also, this adds overhead to basic email relaying.
A simple relay will only add a single
header.
ahh. So this really is intending to add this requirement to every MTA,
for every message.
*Why?*
Email service providers will often add two, one on behalf of
the actual originator the other for themselves. A relay that
rewrites email from one domain to another will add two headers to
record the rewriting.
That's not a relay.
Relays don't change the message. And they typically only add a
Received: header field.
So this is really for Intermediaries and not Relays?
Also, adding a DKIM signature isn't onerous, but it's not free, either.
If this is for Intermediaries/mediators, how does this differ from ARC,
and what are the improvements?
If an email is accepted by a server but it is later found that it
cannot be delivered onward, or further analysis of its contents leads
to a determination that it should not be delivered after all, then
the previous hop is informed by means of a Delivery Status
Notification (DSN).
Previous hop? Not the Originator? Why?
If a DSN is received for an email that was
previously relayed, then the DSN is passed back to the system that
delivered the email. Hence, DSNs are returned back along the
"outgoing path" until they reach a system that can take
responsibility for handling the report.
Received by whom?
Relayed by whom?
"delivered the email"? Should this have said 'originated'? If not,
then the statement needs to be clearer about its reasoning.
DSN passed to the /delivering/ system??? Methinks originating is meant.
Why is the 'previous hop' mediating notification of the originator?
Why is this path dependent?
*Why is this extra handling overhead, and demonstrably fragile
handling model, being required for every message and every MTA?*
The DKIM2 headers contain the source and destination of the email
Oh? email addresses, or just domains?
Is 'source' the signer or is that a third parameter? If same as source,
why?
So this requires a separate message for each addressee, for every
provider, everywhere, including sites with no DKIM Replay problems. /The
incremental cost of the proposed mechanism is not small./
*Given that DKIM Replay is not a universal problem, dealing with it
should incur extra cost only for those having the problem.*
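To put a rough shape on that cost, here is a minimal sketch. It assumes that binding the envelope addresses into the signed data means one signing operation per addressee. The function names, the `mf=`/`rt=` string layout and the HMAC stand-in for a real public-key signature are all hypothetical, chosen only so the example runs with the standard library. None of it is taken from the draft.

```python
import hashlib
import hmac

# Hypothetical stand-in for a real signing key; HMAC replaces the
# public-key signature only so the sketch is self-contained.
KEY = b"example-signing-key"


def sign(data: bytes) -> str:
    """Stand-in for one signing operation."""
    return hmac.new(KEY, data, hashlib.sha256).hexdigest()


def sign_once(message: bytes, recipients: list[str]) -> dict[str, str]:
    """DKIM1 style: one signature, shared by every copy of the message."""
    signature = sign(message)
    return {rcpt: signature for rcpt in recipients}


def sign_per_recipient(message: bytes, sender: str,
                       recipients: list[str]) -> dict[str, str]:
    """Recipient-bound style: the envelope addresses are part of the
    signed data, so every addressee needs a separately signed copy."""
    return {
        rcpt: sign(f"mf={sender};rt={rcpt};".encode() + message)
        for rcpt in recipients
    }


recipients = [f"user{n}@example.net" for n in range(1000)]
shared = sign_once(b"hello", recipients)
bound = sign_per_recipient(b"hello", "author@example.com", recipients)

print(len(set(shared.values())))  # 1 signing operation's worth of output
print(len(set(bound.values())))   # 1000 distinct signatures
```

The point of the sketch is only the ratio: one signing operation versus one per addressee, paid by every sender whether or not they have a replay problem.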
They may also request "feedback" from later entities as to the
quality of the email. This feedback is sent directly from systems
that choose to honor such requests to any all of the requestors that
the sender deems appropriate.
'Quality' is a pretty squishy topic. I believe there is no widespread
operational history with sharing such information, never mind any
standard for representing it.
*This, again, seems an open research topic, once the intended
meaning is clarified.*
Intermediaries that alter emails record their actions (so that later
hops can undo and check signatures). Intermediaries whose
alterations are too complex to be described or reversed will either
have to arrange to be treated as the originator of the message (if
they are near the start of the message's journey) or they will need
to implicitly trusted by any further intermediaries (if they are near
the end of the message's) journey.
There is quite a lot to unpack with this paragraph, and quite a lot of
work needed to satisfy its requirements. And before that, quite a lot of
work to explain the reasoning and basis for it.
"arrange to be treated as the originator" - huh?
"need to implicitly trusted by any further intermediaries" - huh?
"if they are near the end of the message's" - huh?
Please revise, to be more pedagogical for an average (or less than
average) reader. I really did not understand what was meant and the
reader should not have to guess or fill in the blanks.
Intermediaries that duplicate ("explode") emails, record their action
so that any further systems that see multiple copies of the email
will not reject or discard the email as a "replay".
4. How the aims of DKIM2 are achieved
4.1. Simplification & codification
*Issue*:
Every DKIM1 signature specifies an explicit list of which email
header fields have been signed. This leads to inconsistent signing
of headers, and allows a signature to be created in which security-
critical headers are not covered.
Rather, it simplified the specification and permitted operational
flexibility, leaving this particular configuration choice to be
specified later, based on operational experience and independent
community rough consensus. It was not clear what the right set would be
and it was (and remains) possible that different scenarios would benefit
from different sets.
In other words, there was no definitive basis for specifying the exact
set to cover. To my knowledge, there still isn't.
*So the concern, here, is not with something left out of DKIM but
something left out of community discussion and consensus. Worse,
there has not been any public discussion that has demonstrated this
to be a problem.*
One of the problems in the assumption about this item is that DKIM is
supposed to be 'protecting' data integrity of 'security-critical' header
fields. That was never a design goal. The design goal was simply a
means of affixing an accountable domain name.
Data integrity protection is a collateral benefit, given the method used
to affix the d= domain name.
So the movement to 'protecting security-critical fields' is a major
change in goals for the mechanism.
*That said, the change does not require a new 'protocol'.*
*It requires a new BCP declaring a signing practice to cover a
specific set of header fields (and body).*
To prevent bad actors from adding
headers which were not originally present it is common to oversign by
signing null versions of headers that are not present. This
oversigning may be extended to signing two, for example, Subject
header fields because some recipients may not enforce the [RFC5322]
requirement of uniqueness.
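For readers unfamiliar with the oversigning practice the quoted text describes, a minimal sketch follows. It relies on the DKIM1 rule that a name listed in h= more times than the field occurs in the message matches a null field. The function name and its arguments are hypothetical.

```python
from collections import Counter


def build_h_tag(message_headers: list[str], to_sign: list[str],
                oversign: list[str]) -> str:
    """Build a DKIM1-style h= value.

    Each name in `to_sign` is listed once per occurrence in the message.
    Each name in `oversign` is listed one extra time, so an instance
    added after signing would invalidate the signature.
    """
    present = Counter(name.lower() for name in message_headers)
    oversigned = {name.lower() for name in oversign}
    names: list[str] = []
    for name in to_sign:
        count = present[name.lower()]
        if name.lower() in oversigned:
            count += 1  # the extra entry covers a "null" instance
        names.extend([name] * count)
    return ":".join(names)


headers_in_message = ["From", "To", "Subject", "Date"]
h_tag = build_h_tag(
    headers_in_message,
    to_sign=["From", "Subject", "Reply-To"],
    oversign=["From", "Subject", "Reply-To"],
)
print(h_tag)  # From:From:Subject:Subject:Reply-To
```

Reply-To is absent from the message, so its single entry signs only the null version. That is exactly the case of preventing a later-added header.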
*Mitigation:*
DKIM2 will specify a fixed set of headers in accordance with now
well-established best practice (and insist they are unique) so there
will be no need to list what is signed.
"insist they are unique"
Sigh. Another step in making email a rigid, limited service, based on a
snapshot in time, with a specific set of uses.
*Note that this purports to fix a problem, but it isn't one. It's
ugly, but that's different from being a problem.*
*So this is a move to make email a lot more rigid, in order to make
a few folk more comfortable, rather than to fix an actual,
significant problem.*
However, some exotic headers may need to be signed for unusual or
future use-cases. DKIM2 will allow this with an h= field.
ahhh.
*So the proposal is not to eliminate listing of fields to be signed,
but to have a core set of required/default fields and not list them.*
That's quite different from what was being claimed.
It also, again, is solving a non-existent problem.
4.2. Preventing backscatter
*Issue:*
With DKIM1, you can send delayed bounces if the message has come
directly to you and the DKIM signature is DMARC aligned, but
otherwise you need to reject at SMTP transaction time to ensure you
won't be creating backscatter.
What is a 'delayed bounce'?
*DKIM has nothing to do with bounces. Really. Nothing.*
Nothing 'requires' rejection at SMTP transaction time.
Rejection at transaction time merely moves the question of sending a
bounce from the server SMTP, after the session, to the client SMTP,
contemporaneously with the session. Or also after it.
In the overall, global model of multi-handler email, this distinction is
pretty minor. It might be deemed significant only in a highly
constrained, single-hop scenario. But even then, well...
*Mitigation:*
Provided that an email is correctly signed when received, it can be
rejected at a later point in time. The DSN will be sent to the
immediately preceding intermediary. Since the bounce travels back
along the (fully authenticated) incoming path it cannot be sent to an
uninvolved third party.
So, the model is a per-hop signing sequence that is used in reverse to
send bounces back through every /relay/ that handled the message?
*This is quite a fragile operational model.*
Note that source routing has been largely deprecated in modern
networking, yet this scheme relies on it.
If someone thinks there is promising operational history with this
model, please provide citations.
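The fragility concern can be pictured with a small sketch. It assumes that a DSN must traverse every earlier hop in reverse order and that a hop which is unreachable, or no longer holds state for the message, ends the journey. The host names and the `reachable` set are hypothetical.

```python
def return_dsn(path: list[str], reachable: set[str]) -> list[str]:
    """Walk a DSN back along the outgoing path, one hop at a time.

    `path` lists the hops in the order the message travelled; the last
    entry is the hop that generates the DSN. The result is the list of
    hops the DSN actually reached, stopping at the first hop that is
    not in `reachable`.
    """
    reached: list[str] = []
    for hop in reversed(path[:-1]):
        if hop not in reachable:
            break  # every earlier hop, including the origin, is cut off
        reached.append(hop)
    return reached


path = ["origin.example", "esp.example", "list.example", "mailbox.example"]

# Every hop still available: the DSN gets all the way back.
print(return_dsn(path, {"origin.example", "esp.example", "list.example"}))

# One middle hop unavailable: the originator never hears about the failure.
print(return_dsn(path, {"origin.example", "list.example"}))
```

With a return address taken from the envelope, only the final destination of the DSN has to be reachable. In the path-dependent model every hop has to be.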
4.3. Improved privacy for forwarders
*Issue:*
If you want to create a privacy preserving forwarding service, you
need to SRS rewrite the email's bounce address so that bounces don't
accidentally leak the real address of the recipient.
This is presented in a fashion that does not adequately describe the
context for the concern.
*Also, I believe it is a wholly new functional requirement, lacking
any history of community concern.*
Feel free to provide documentation to the contrary.
To suggest a more complete foundation and to test whether I understand
what is intended:
When a message is re-posted, such as by an aliasing or mailing list
service, the original author will not necessarily know the address
of the final recipient. If there is desire to keep the original
author from knowing that (new) address, then a bounce message
created during this later processing MUST NOT have the original
RFC5321.Mail-From return address. Otherwise, the original author
might see this later recipient address in the bounce email.
*Mitigation:*
Since the DSN messages always go back up the DKIM2 chain, any hop can
strip off the higher number (i=) records; including the sender and
recipient addresses for them, and create a bounce as if the forwarder
itself was doing the rejection. As asynchronous bounces will be
common in DKIM2, this is indistinguishable to the sender.
*This is quite a lot of unnecessary mechanism, for this purpose,
absent a much larger system-wide goal for hardened security and
privacy functions.*
*To date, there is no specification for such a service. That makes
the requirement, here, frankly arbitrary.*
Replacing an SMTP Mail From with a different address is appropriate for
a number of cases. It is simple and effective. It is also already quite
common.
The current scheme is much more complicated and does not seem to provide
an improvement.
How is i= relevant to this? What does it mean to be a 'higher number'
i= record? The text here appears to rely on some extensive detail that
isn't at all obvious to me. I suspect it won't be to other readers.
(From reading much farther down, I gather it's the sequence number,
except that the nature and use of the sequence number is not explained.
E.g., sequence of what?)
4.4. Simplifying error handling for intermediaries
*Issue:*
ESPs and other entities that send email on behalf of others have a
need to know when delivery errors occur.
An ESP and other senders 'on behalf of' are not 'intermediaries'. They are
agents of the author. Hence any issues for them are internal to the
author's organization.
Also, having them generate mail that has them, or their customer, as
the recipient of bounce messages is a matter of their choice.
At present this can only be
done by changing the RFC5321 return path so that DSNs will be
delivered to an intermediary rather than original sender.
In what way is this a problem?
Non-
standardised mechanisms such as VERP or SRS may be employed to be
able to pin down the details of the failure.
Since they are originating the message, they are not 'changing' the SMTP
Mail From. They are simply creating it, using their own address. There
is nothing in email that makes this an issue, except for linking SPF and
DMARC.
As for 'pinning down the details of the failure', what does that mean?
*Mitigation:*
In DKIM2 DSNs are passed back along the outgoing path so the ESP will
receive the DSN and, depending on contractual arrangements, may be
able to avoid passing this message any further back along the chain.
This is relying on the complex return handling path when it isn't needed.
*Issue:*
A mailing list wishes to learn when email it has handled cannot be
delivered. At present DSNs (as opposed to next hop delivery
rejections) are often passed to the originator of the email (the
value in the [RFC5321] MAIL FROM) and are invisible to the mailing
list.
*Mitigation:*
Passing bounces back along the outgoing path allows a mailing list to
take responsibility for the event and not bother the person who sent
a message to the list.
Solving a non-problem.
A mailing list is generating a new message posting. It can choose the
SMTP Mail From to be anything it wants, including itself, and this is
common. So this is not a problem and does not require additional mechanism.
4.5. The "mailing list DMARC issue"
*Issue:*
Once an intermediate (for example, a mailing list or alumni
forwarder) makes a change to the header or body of a message, the
hashes covered by the sender's DKIM signature no longer match, and
it's not possible to see whether the message is semantically similar,
or has been completely replaced by a bad actor.
Breakage depends on whether DKIM was covering the portion that got changed.
If preservation of the content is that important, use a
content-protecting mechanism that is more likely to survive a
re-posting. Choices are already available, and have been for decades.
If you don't like the choices, consider defining a mime-based protection
mechanism that is independent of DKIM, while re-using DKIM's design
approach.
My guess is that the actual concern is rather more specific than what
is stated here.
*Mitigation:*
DKIM2 will define an algebra for describing how to reverse any
changes to create the prior binary data, by inspecting the diff
between the two versions, recipients will be able to see who injected
bad content.
Mailing lists (or alumni forwarders etc.) that alter the Subject
header field (or other [RFC5322] headers) will record the previous
header field contents. This is easy to undo for checking purposes.
Mailing lists that add text (either to a simple email body or one or
more MIME parts within the body) will record details of the text they
have added. This text can then be removed when checking earlier
signatures.
We expect the "algebra" describing changes to be in a stand-alone
document that need not be finalised on the same timescale as DKIM2
itself.
This is going to cause significant collateral damage, as its use becomes
ingrained and/or widespread:
This is going to define a set of 'acceptable' changes by
intermediaries and will result in calling other changes a 'problem'
or even an 'error'. (cf, calling legitimate From: addresses
'spoofed'.)
*This will further marginalize mail that is entirely legitimate, but
which does not fit into an increasingly Procrustean model.*
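For readers trying to picture the "record and undo" idea in the quoted mitigation, here is a minimal sketch. It assumes the simplest possible case, a Subject tag plus an appended footer. The `ChangeRecord` structure and all function names are hypothetical, since the draft defers the actual "algebra" to a separate document.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ChangeRecord:
    """Hypothetical record an intermediary attaches; not the draft's format."""
    previous_subject: str
    appended_footer: str


def digest(subject: str, body: str) -> str:
    """Stand-in for the hash covered by the original signature."""
    return hashlib.sha256(f"{subject}\n{body}".encode()).hexdigest()


def list_modify(subject: str, body: str, tag: str,
                footer: str) -> tuple[str, str, ChangeRecord]:
    """What a mailing list might do: tag the Subject, append a footer."""
    record = ChangeRecord(previous_subject=subject, appended_footer=footer)
    return f"{tag} {subject}", body + footer, record


def undo(subject: str, body: str, record: ChangeRecord) -> tuple[str, str]:
    """Reverse the recorded changes so the earlier hash can be rechecked."""
    if not body.endswith(record.appended_footer):
        raise ValueError("recorded footer not found at end of body")
    restored_body = body[: len(body) - len(record.appended_footer)]
    return record.previous_subject, restored_body


original = ("Hello", "Body text\n")
signed_digest = digest(*original)

subject, body, record = list_modify(*original, "[list]", "-- \nlist footer\n")
print(digest(subject, body) == signed_digest)                 # False
print(digest(*undo(subject, body, record)) == signed_digest)  # True
```

Anything that cannot be expressed as such a record, for example re-encoding a MIME part, has no `undo`. That is where the 'acceptable changes' boundary gets drawn.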
4.6. Security gateways
*Issue:*
There are some types of alteration, for example by security gateways,
that may be impractical to describe in a cost-effective manner.
huh?
*Mitigation:*
We would expect that outgoing gateways that may be adding disclaimers
or rewriting internal identifiers would be provided with appropriate
signing keys so that they could be the "first hop" as far as the rest
of the email handling chain is concerned.
Incoming security gateways may be making substantial changes.
Typically they will remove problematic types of attachment and
rewrite URLs to use "interstitials". Since this type of
functionality is generally provided on a contracted basis further
intermediaries will be fully aware of the presence of the security
gateway and can be configured to implicitly trust that it has checked
earlier signatures and found them to be correct. Hence there is not
need to be able to "undo" these changes.
So, ummm, this problem isn't really a problem?
4.7. Addressing DKIM-replay
*Issue:*
Because an email can currently be sent as "Bcc" such that there's no
evidence in the message data of who the recipient is expected to be,
it's possible to take a message that is correctly signed and replay
it millions of times to different destination addresses as if they
had been BCC'd. This message can be resent at any time.
Using BCC does not create this ability. Any recipient -- whether listed
or not -- can replay a message, if they have adequate control of the
re-posting service.
Using BCC hides the address from recipients, but there is little or no
evidence that that affects recipient behavior. (Other than, perhaps,
their sometimes wondering why they got the message.)
*Mitigation:*
DKIM2 headers will always have timestamps so that "old" signatures
have no value.
Reports of replay attacks have indicated that replay can and does
happen, well within normal email handling time-frames. So the timestamp
is of no help.
DKIM2 headers specify both "from" and "to" so that most opportunities
to alter a message, re-sign it and replay it at scale will no longer
be possible.
This is the more powerful approach to replay mitigation. As long as one
copy per addressee is acceptable.
Where does the DKIM2 receiver get the addressee info from, to use for
signature evaluation?
Since the "to" address is always encoded in the email,
any email to multiple recipients must be exploded by the sender, and
each copy signed separately with different headers.
If the email is replayed (perhaps through a large system with many
different customers) then if the email does not say that it has been
duplicated then signatures can be assumed to be unique and hence
simple caching (or Bloom filters) will identify replays. If the
email has been duplicated then recipients can assign a reputation to
the entity that did the duplication (along with the expected number
of duplicates that will arrive from that entity) and assess duplicate
signatures on that basis.
How will it know which site did the duplication, given the possibility
of multiple hops after duplication?
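The "simple caching" idea in the quoted paragraph can be sketched as below. This is an illustration under assumptions: the class name, the returned labels and the `declared_duplicate` flag are hypothetical, and a real receiver would bound the cache in time and size.

```python
class ReplayCache:
    """Remember signature values seen so far and flag repeats."""

    def __init__(self) -> None:
        self._seen: dict[str, int] = {}

    def check(self, signature: str, declared_duplicate: bool = False) -> str:
        """Classify one arriving signature value.

        `declared_duplicate` stands in for the message saying that some
        hop exploded it; this sketch does not verify that claim.
        """
        count = self._seen.get(signature, 0) + 1
        self._seen[signature] = count
        if count == 1:
            return "first-seen"
        if declared_duplicate:
            return "declared-duplicate"
        return "suspected-replay"


cache = ReplayCache()
print(cache.check("sigA"))                           # first-seen
print(cache.check("sigA"))                           # suspected-replay
print(cache.check("sigB", declared_duplicate=True))  # first-seen
print(cache.check("sigB", declared_duplicate=True))  # declared-duplicate
```

Note that the sketch has to take the duplication claim as an input. Attributing it to a particular hop is the part the question above is asking about.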
If the email is altered before duplication then it is again the case
that this will be apparent to the recipient who can develop a
reputation system for the entity that did the modification and
replay.
4.8. Algorithmic dexterity
The specification will require both RSA and elliptic curve be
implemented. If there is IETF consensus around a "post-quantum"
scheme then that will also be included. Experience with DKIM1 is
that everyone supports RSA keys and EC support is very patchy so we
will emphasize this aspect in bake-offs etc.
There has been a continuing need to be able to add/replace crypto
algorithms. So the dexterity is a legitimate need, but is not new. And
it is already supported in DKIM. And, really, it has nothing to do with
the status of P-Q concerns.
Dexterity will become essential if advances in cryptanalysis cause a
particular type of algorithm to become deprecated. To allow a phased
switch away from such an algorithm we will make provision for more
than one signature to be present in a single DKIM2 header. Systems
capable of checking both signatures will require both to be correct.
If only one signature is correct then email will be rejected with a
clear message -- allowing interworking issues to be easily debugged.
4.9. Reducing crypto-calculations
Experience at large mailbox providers is that incoming messages can
have large numbers of DKIM signatures all of which need to be
checked.
But, do they really /all/ have to be checked? Seriously, why can't
there be some selectivity?
For DKIM2, in the common case where email has not been
altered by earlier hops, it will only be necessary to check the first
DKIM2 signature, the one applied by the previous hop and, if
"feedback" is to be provided, the signatures of any entities that
have requested feedback.
huh? This does not seem at all obvious.
Also, it is not obvious that the current use of DKIM requires checking
all the signatures. Please explain why.
If DKIM-replay is felt to be an issue (and some providers will detect
this by identifying non-unique signatures)
Non-unique signatures? Since I am quite sure this does not mean two
different signatures that produce the same value, what does this mean
and how is it a problem?
then more DKIM2 headers
may need to be processed to establish the veracity of an alleged
forwarding path. Additionally any attempt to do forensics or to
assign reputation to intermediates will require more signatures to be
checked.
What is meant by forwarding path? How is it specified? What does it mean
to 'establish the veracity' of it?
As for needing to check signatures by intermediaries, before performing
reputation analysis... yup? What is the problem?
5. DKIM2 header fields
DKIM2 headers will have the following fields
+============+=================================================+
| Field | Explanation |
| identifier | |
+============+=================================================+
| i= | Sequence Number (from 1 to N) |
+------------+-------------------------------------------------+
| t= | Timestamp |
+------------+-------------------------------------------------+
| ds= | Signing key identifier (domain & selector) |
So this appears to conflate selector with domain name being signed? Why?
How is the domain name being signed identified separately?
What is the purpose of the sequence number and how is it used?
+------------+-------------------------------------------------+
| a= | Crypto algorithm(s) used (unless combined with |
| | b= to allow for multiple signatures on the same |
| | email, see discussion of crypto-agility above) |
+------------+-------------------------------------------------+
| b= | Signature over hash value strings (DKIM uses |
| | b=) |
+------------+-------------------------------------------------+
| bh= | Body hash value (see discussion) |
+------------+-------------------------------------------------+
| h= | Extra headers signed by this hop |
+------------+-------------------------------------------------+
| m= | Indicates if mail has been modified or exploded |
huh?
+------------+-------------------------------------------------+
| mf= | RFC5321.mail-from |
+------------+-------------------------------------------------+
| rt= | RFC5321.rcpt-to |
/both/ of these???
btw, Mail From is meant to be a return address, not an author
indication, in spite of the name and in spite of the RFC821, etc.
documentation. Also, none of that ever required a specific value, such
as author address.
+------------+-------------------------------------------------+
| fb= | feedback requested for this email |
+------------+-------------------------------------------------+
| n= | a nonce value (could use for database lookup |
| | for DSN handling) |
+------------+-------------------------------------------------+
Table 1
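To make the field list concrete, here is a minimal sketch of reading such a header. It assumes, since the draft does not say, that DKIM2 reuses the DKIM1 "tag=value;" list syntax. The function name parse_dkim2_tags, the sample values, and the form of the ds= value are all hypothetical.

```python
def parse_dkim2_tags(value: str) -> dict[str, str]:
    """Split a DKIM1-style 'tag=value; tag=value' string into a dict.

    Assumption: DKIM2 keeps the DKIM1 tag-list syntax; the draft does
    not specify the wire format.
    """
    tags: dict[str, str] = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing ';'
        # Split on the first '=' only, since base64 values may contain '='.
        name, sep, val = part.partition("=")
        if not sep:
            raise ValueError(f"malformed tag: {part!r}")
        name = name.strip()
        if name in tags:
            raise ValueError(f"duplicate tag: {name}")
        tags[name] = val.strip()
    return tags


# Hypothetical header value; every value here is made up for illustration.
example = (
    "i=1; t=1730937600; ds=sel1.example.com; "
    "mf=alice@example.com; rt=bob@example.net; fb=y"
)
fields = parse_dkim2_tags(example)
```

Even this toy version shows the open questions raised above: nothing in ds= says which part is the selector and which is the signing domain.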
At the first hop there cannot be any modification,
Why not?
so instead the m=
field is used to indicate a request by the sender that the email
never be modified and/or never be “exploded” to multiple recipients.
This sounds appealing.
However a) its actual purpose, b) its actual efficacy, and c) its
likely enforcement are worth questioning and explaining carefully, lest
it otherwise seem like a solution in search of a problem.
This might be appropriate for some types of transactional email.
Since it is only a request, intermediaries may, by local policy, not
honor it, but they SHOULD NOT relay mail where the request has not
been honored to third parties.
huh? maybe re-code this with fewer or no negatives?
We will always hash headers in a "relaxed" mode (to use the DKIM1
jargon). For the body we will always use "relaxed" because that is
the usual scheme for DKIM1. Relaxed is more expensive than "simple"
but there have been concerns expressed about interworking. The
vast majority of DKIM1 email (99%+) uses relaxed/relaxed.
The hashes of the header and the body can be placed into the DKIM2
header field, and then a signature is applied. Alternatively we
could define a signature over the hashes but not record their values.
The second is neater but the former does allow an unchanged body hash
to just be copied and may assist in diagnosing faults. Note also,
that there is no technical reason for computing two hashes rather
than one, but the presence of an obviously unchanged body hash may
again be useful for fault diagnosis and may assist humans in
reasoning about how and where changes were made to the body.
The nonce value is available for any purpose, but may well be used as
an index into a database to access meta-data about an email that has
been handled in the past. DKIM2 signatures expire after a fixed
period (a week would be appropriate) so that it is not necessary to
hold information for indefinite periods or to handle DSNs for email
that was delivered long ago.
Note that we have not included a version number. Experience from
MIME onwards shows that it is essentially impossible to change
version numbers. If it becomes necessary to change DKIM2 in the sort
of incompatible way that a v=2 / v=3 version number would support, we
recommend using header fields labeled DKIM3 instead.
indeed
For simplicity using single characters before the = would be good,
but we quickly run out of obvious relevant letters. Where there is a
match to DKIM1 fields these have been copied (though this is merely
to assist humans, it's unlikely to affect any code).
ack.
5.1. Value of rt=
Note that it is inherent in the DKIM2 design that emails are only ever
sent to one recipient at a time. At present some mail servers will
batch deliveries together if they are going to the same destination
and issue multiple RCPT TO commands in the SMTP protocol
conversation. This is not possible with DKIM2 because each email
must document a single RFC5321 destination in the rt= parameter.
This may seem inefficient but only about 1% of email is currently
delivered using multiple RCPT TO commands.
Consider a bulk-mail packaging option. Something like using the
destination domain name without email address? (Shades of multicasting.)
5.2. Maximum value for i=
There will be an absolute maximum value of 255 for i=; though
realistically we should be able to bring this value back down to
about 50.
There has been no explanation of what sequence the i= value is
indicating, or why it is needed.
Why is 255 a reasonable maximum?
5.3. Maximum age for t=
For a message in transit, the timestamp MUST be less than one week
ago. For bounces, they MUST be returned to their source within 2
weeks of the timestamp on hop i=1. This requires that as the
destination, you MUST create the bounce within 1 week of receipt.
Again, this seems unlikely to be very useful.
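The two age limits quoted in 5.3 reduce to simple arithmetic. This is a minimal sketch assuming t= is a Unix timestamp in seconds, as in DKIM1. The function names are hypothetical, and rejecting timestamps in the future is my assumption; the draft is silent on clock skew.

```python
WEEK = 7 * 24 * 3600  # seconds


def in_transit_ok(t_hop: int, now: int) -> bool:
    """A message in transit: its timestamp must be less than one week old.

    Assumption: a timestamp in the future fails the check.
    """
    return 0 <= now - t_hop < WEEK


def bounce_ok(t_first_hop: int, now: int) -> bool:
    """A bounce: must be back at its source within 2 weeks of the i=1 timestamp."""
    return 0 <= now - t_first_hop <= 2 * WEEK
```

Note that the quoted text does not say which hop's timestamp the one-week limit applies to; the sketch leaves that choice to the caller.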
5.4. Registry of values for m=
+==========+==============================================+
| Value | Purpose |
+==========+==============================================+
| nomodify | Any hop that adds this requires no |
| | modifications; any later hop must either |
| | reject it or agree to pass it on unmodified |
+----------+----------------------------------------------+
| header | This hop has modified the headers; a |
| | separate header lists the algebra to revert |
| | the changes |
+----------+----------------------------------------------+
| body | This hop has modified the body; a separate |
| | header lists the algebra to revert the |
| | changes |
+----------+----------------------------------------------+
| complex | This hop has done something complex and |
| | there is no way to revert it |
+----------+----------------------------------------------+
what will be done with different values?
While the choices seem intuitively reasonable, there is a long history
of definitions seeming that, but lacking concrete specification for
actual use, which turn out to be far less useful than expected.
Table 2
If there is no "m" value then this hop asserts it has not modified
either the body or any header covered by a previous DKIM2 signature.
5.5. Value for the fb= header
Not present, do not send feedback for this email to the signing
domain
fb=y - this signing domain would like to receive feedback about the
disposition of this email (e.g. percentage reported as spam).
To what address? Why not specify the address here?
6. Checking hashes
For clarity this is written as if there is only one hash and
signature to check, whereas there may in fact be one for the header
and one for the body. Issues with DNS will cause [RFC5321] 4xx
responses to be sent.
6.1. Step 1
Find the latest DKIM2 signature and determine if the email as
received matches that hash AND that you are named as the destination
of the email AND that the mail is coming from the named source. If
not then refuse to accept the email. The previous hop is responsible
for the email (they have either forged it or mangled it -- either way
it is their problem).
If this check passes then other checks can either be done at delivery
time or the mail can be accepted and if later rejected there is a
valid path back to the sender over which a DSN can be sent.
If it is decided, by local policy, to accept an email where the hash
fails (or you are not the documented recipient) then DSN messages
back along the chain MUST NOT be sent.
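As I read Step 1, it is three checks against the most recent DKIM2 header. This is a minimal sketch, assuming the header has already been parsed and that signature verification has been done elsewhere. The Dkim2Sig type, its field names, and step1_accept are hypothetical, and exact string comparison of addresses is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Dkim2Sig:
    i: int          # sequence number
    msg_hash: str   # hash recorded by the signing hop
    mf: str         # RFC5321.mail-from recorded by the signing hop
    rt: str         # RFC5321.rcpt-to recorded by the signing hop


def step1_accept(sigs: list[Dkim2Sig], computed_hash: str,
                 smtp_mail_from: str, smtp_rcpt_to: str) -> bool:
    """Return True if the message passes the three Step 1 checks.

    Assumption: the cryptographic check of b= is done separately.
    """
    if not sigs:
        return False
    latest = max(sigs, key=lambda s: s.i)  # "the latest DKIM2 signature"
    return (
        latest.msg_hash == computed_hash   # email as received matches the hash
        and latest.rt == smtp_rcpt_to      # we are the named destination
        and latest.mf == smtp_mail_from    # mail comes from the named source
    )
```

A False result corresponds to "refuse to accept the email"; whether to override that is the local-policy case in the paragraph above.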
6.2. Step 2
Find the first DKIM2 header (numbered 1) and determine if the email
as received has the same hash as recorded there. If so, you know the
email has not been altered on its way to you. The path it has
followed is most likely irrelevant for deciding what to do with it.
6.3. Dealing with modifications.
This is not dealing with the hash. It is a separate topic and should be
a different section.
Find the highest numbered DKIM2 header that reports a modification.
Undo the modification and repeat. When all modifications have been
done then there should be a match with the original signature (at
hop1). If not then the email has been altered (in an undocumented
manner) on its way to you and it SHOULD be rejected.
Do not embed policy directives in the middle of algorithm specifications.
And the framing of such a directive is not to tell them what they should
or should not do, but what the signers would like done. The difference
has to do with authority. The signer has no authority over the site
interpreting the signature. And the choice of policy to apply after
evaluation is /always/ a local matter. That is, in terms of protocol
specification, the actual policy applied is outside the scope of this
document.
Note that it is not necessary to check the signature on a DKIM2
header that reports a modification. Undoing the modification and
discovering that the message can now be authenticated is sufficient.
The second sentence is too cryptic. Perhaps:
Rather, reversing the modification it reports should make it
possible to check the original DKIM2 signature and validate that.
This makes evaluation of a DKIM signature by the reporting site
unnecessary.
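The undo-and-repeat procedure quoted in 6.3 can be sketched as follows. The draft does not define the "algebra" for reverting changes, so each reported modification is modeled here as a hypothetical undo function. The Hop type, the use of SHA-256, and the footer example are assumptions for illustration only.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Hop:
    i: int                                            # sequence number
    body_hash: str                                    # hash recorded at this hop
    undo: Optional[Callable[[bytes], bytes]] = None   # None: no modification reported


def matches_first_hop(body: bytes, hops: list[Hop]) -> bool:
    """Undo reported modifications, highest-numbered hop first,
    then compare the result with the hash recorded at hop 1."""
    for hop in sorted(hops, key=lambda h: h.i, reverse=True):
        if hop.undo is not None:
            body = hop.undo(body)
    first = min(hops, key=lambda h: h.i)
    return hashlib.sha256(body).hexdigest() == first.body_hash


# Hypothetical example: hop 2 is a mailing list that appended a footer.
original = b"Hello\r\n"
footer = b"-- \r\nlist footer\r\n"
hops = [
    Hop(1, hashlib.sha256(original).hexdigest()),
    Hop(2, hashlib.sha256(original + footer).hexdigest(),
        undo=lambda b: b.removesuffix(footer)),
]
```

Note that the function only reports whether the hashes match. What to do on a mismatch is left to the caller, in line with the point above about keeping policy out of the algorithm.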
Over time a reputation can be developed for an intermediary which is
making modifications and given sufficient trust then the "undo" step
could be skipped.
*I suspect this document has nothing to do with the nature and
details of reputation assessment. *
It is attempting to provide some reliable and accurate information that
can be used by such an engine, but that's different from this document's
offering comments about how such an engine works.
Note that the signature of the DKIM2 header that
reports the modification would need to be checked to ensure
reputation accrued to the correct entity.
If the modification is substantial (eg URLs rewritten, MIME parts
removed) and it cannot be undone then the receiver (who may not be
the immediate next hop) MUST trust the system doing the modification.
If it does not then the mail SHOULD be rejected.
It will be noted that some modifications can totally change the
meaning of an email. This specification does not try to limit
modifications. We believe that being able to attribute modifications
to a particular entity will allow reliable blocking of malicious
intermediaries once they are identified.
6.4. Dealing with replays
Checking source and destination as recorded by the previous hop makes
many “DKIM replay” scenarios impossible.
There is more than one DKIM replay scenario? What are they?
It is possible to exclude all replays by determining if any DKIM2
header reports an expansion event (one incoming email resulting in
multiple further emails).
uh... This only works if the expansion platform is run by good actors.
While replay has been done that way, it isn't the interesting
arrangement, since that path is easily closed.
If not then you would expect that the
(original) hash of the email is unique and duplicates can be
rejected.
If an expansion event is recorded then receiving multiple copies would
not be a surprise.
To whom? And how is this, somehow, a useful point?
It will be necessary to use local policy to
assess whether the number of copies received is acceptable or not.
This does not deal with expansion that does not produce many copies to
the same platform.
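For reference, the duplicate check described in 6.4 amounts to a counter keyed on the original hash. This is a minimal sketch, assuming an in-memory table and a fixed limit for expanded mail. The ReplayFilter class and its limit parameter are hypothetical, and the sketch inherits the weakness noted above: it trusts whatever the expansion hop reports.

```python
class ReplayFilter:
    """Count copies seen per original (hop 1) hash."""

    def __init__(self, max_copies_if_expanded: int = 100):
        self.seen: dict[str, int] = {}
        # Assumption: local policy is a simple fixed limit.
        self.max_copies = max_copies_if_expanded

    def accept(self, original_hash: str, expansion_reported: bool) -> bool:
        count = self.seen.get(original_hash, 0) + 1
        self.seen[original_hash] = count
        # No expansion reported: the hash should be unique, so one copy only.
        limit = self.max_copies if expansion_reported else 1
        return count <= limit
```

Since the counter is local to one receiving platform, it does nothing for copies spread across many destinations.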
Over time you may wish to develop a reputation for a DKIM2 identity
which is doing expansions and conclude that a specific number of
copies is to be expected. This can be used to refine local policy.
cf, Avoiding discussion about reputation in this protocol spec.
...
8. DKIM1/DKIM2 Interworking
*There is no interworking. These are entirely independent and
parallel systems.*
*This section should be written without reference to DKIM and only
in terms of incremental adoption and handling of non-support.*
Note that DKIM2 signed email can also be DKIM1 signed, and so systems
that are not DKIM2 aware can and will operate as they do at present.
DKIM2 capable servers will announce the capability in their initial
banner in the usual manner for SMTP extensions.
What effect does the SMTP server's making the announcement have? What
difference does it make?
DKIM is able to be implemented only in MUAs and does not require any
infrastructure support, though of course it is permitted. DKIM2
apparently requires deep infrastructure support at every hop along the
way. That makes it extremely fragile.
When a DKIM2 signed email is delivered to a server that does not
understand DKIM2 and leaves the DKIM2 ecosystem the DKIM2 specific
events can no longer be expected to occur. In particular any
failures to deliver will be reported to the address in the
relevant return path and not back along the DKIM2 chain.
A DKIM2 signed email may be delivered to a server that understands
DKIM2 but if that server needs to forward the email elsewhere it may
find that there is no signing key available for the relevant domain
(recalling that the incoming email recorded the destination domain
and it is necessary for the next "hop" to match with that). In such a
case, once more the email will leave the DKIM2 ecosystem.
I don't understand. What signing key might it not find? Signing
for what?
Refusing to allow an email to leave the DKIM2 ecosystem may be an
appropriate choice in some circumstances. If so then an appropriate
DSN should be created and passed back along the chain in the normal
manner.
Tossing this comment out, with no basis, detail, or precedent, is odd
and probably not very helpful, since it ultimately is a vote for making
email even more restrictive. Limiting who mail can go to is a tad
counter-productive.
...
d/
--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
bluesky: @dcrocker.bsky.social
mast: @dcrocker@mastodon.social