On Tue 28/Dec/2021 12:11:15 +0100 Hans-Martin Mosner via mailop wrote:
On 28.12.21 at 11:08, Alessandro Vesely via mailop wrote:

Yeah, we inevitably fall back to IP address lists.  Perhaps not so much because it's easier to outline a perimeter using numbers than names, but because it's rather immediate to operate on the former.  A set of good names would sound like a meaningful friendly region, immune from changes of ISP.

OTOH, if it were possible to ascribe each nastiness to its actual culprit, we'd still need a policy to treat them effectively.  IME, banning for a period, à la fail2ban, requires too much looking after exceptions.  But allowing each individual to carry out a million-email spam run even just once in a lifetime is obviously too much.  I don't think a military approach would do much better.

I'm working on a reputation-based system which would use a p2p network to transmit reputation opinions very quickly, allowing each node's policy to decide who to trust and what actions to take.


I've been dreaming of such a system, but never dared to actually design something real, let alone code it. It was about a decade ago, and there was an IETF mailing list about domain reputation. However, it soon became clear that the IETF would spend its energy detailing how to communicate reputation, not how to compute it.[*] That effort ended up publishing RFCs 7070-7073.


Transmitted information would consist of "statements" stating something that may or may not be true, and "opinions" signed (pseudonymously) by participants expressing agreement or disagreement with statements (using a range of -3..3). The subjects of statements may be IP ranges (special case is single IP address), AS numbers, domain names, e-mail addresses, URLs, or hashes of such subjects (mostly to protect privacy for exploited e-mail addresses).

Some examples:

  * Statement "spammer(frauds...@gmail.com)", opinion "certainty=3,
    valid_from=2021-12-28, valid_for=30, signature=xyz"
      o This expresses that "xyz" firmly believes that frauds...@gmail.com is a
        spammer. The opinion should be considered valid for 30 days starting
        today.
  * Statement "spammer(tana.it)", opinion "certainty=-3, valid_from=2021-12-28,
    valid_for=365, signature=vesely"
      o This expresses that tana.it is certainly not a spammer
  * Statement "spammer_friendly(AS202306)", opinion "certainty=2,
    valid_from=2021-12-28, valid_for=30, signature=xyz"
      o xyz is fairly sure that the operators of AS202306 don't give a flying
        f*k about spammers on their network
  * Statement "exploited(#Tu6sYF3pYtQFIrKr3Sktx4innT47Jk7jMAHHhsg5ZGU=)",
    opinion "certainty=3, valid_from=2021-12-28, valid_for=7, signature=xyz"
      o The resource which hashes to this value is believed to be exploited by
        spammers. The signer assumes that this would be fixed within a week.
  * Additional statement types would be more database-like, to help in
    implementing policies and to report abuse:
      o "dynamic()" to maintain a list of dynamic IP address ranges and domains,
      o "asn(IP,AS)" to express that a given IP range belongs to some
        autonomous system,
      o "abuse(IP|Domain,EMail|URL)" to express that abuse complaints for a
        given IP range or domain can be sent to some e-mail address or entered
        into a web form. Negative opinions about such "abuse()" statements
        would state that these abuse contacts seem to be non-functional.
      o "signer()" makes it possible to publicly express a network of trust, so
        if I sign a statement "signer(vesely)" with a certainty of 2 I state
        that I consider Alessandro's opinions valuable. He may sign the same
        statement with a certainty of 3 unless he doesn't trust his own
        opinions, which would be a bit strange.
  * It should be possible to store private opinions (for example to whitelist
    certain resources) that are meant to determine local policy decisions but
    should not be shared with the network.
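
To make the statement/opinion model above concrete, here is a minimal sketch in Python. It is not the actual design: the class names, the field layout, and in particular the hashing scheme (SHA-256 over the lowercased subject, base64-encoded, prefixed with "#") are assumptions chosen only because they resemble the examples. The signature is a plain placeholder string; no real cryptography is shown.

```python
import base64
import hashlib
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Statement:
    kind: str     # e.g. "spammer", "exploited", "spammer_friendly"
    subject: str  # IP range, AS number, domain, address, URL, or "#<hash>"


@dataclass(frozen=True)
class Opinion:
    statement: Statement
    certainty: int    # -3 (firm disagreement) .. 3 (firm agreement)
    valid_from: date
    valid_for: int    # validity period in days
    signature: str    # placeholder for the signer's pseudonym

    def __post_init__(self):
        # Enforce the -3..3 range described in the text.
        if not -3 <= self.certainty <= 3:
            raise ValueError("certainty must be in -3..3")

    def is_valid_on(self, day: date) -> bool:
        # Valid from valid_from (inclusive) for valid_for days.
        return self.valid_from <= day < self.valid_from + timedelta(days=self.valid_for)


def hashed_subject(subject: str) -> str:
    # Assumed scheme: SHA-256 of the lowercased subject, base64, "#" prefix.
    digest = hashlib.sha256(subject.lower().encode("utf-8")).digest()
    return "#" + base64.b64encode(digest).decode("ascii")


# Usage: the tana.it example from the list above.
op = Opinion(Statement("spammer", "tana.it"), -3, date(2021, 12, 28), 365, "vesely")
print(op.is_valid_on(date(2022, 6, 1)))
```

Private opinions could reuse the same structure and simply never be published; how that flag is stored is left open here.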


If you're not yet at an advanced coding stage, I'd suggest reviewing that series of RFCs. They define a media type and an HTTP-based transport. At least, as the summary of some three years of discussions on that topic, they'd make for interesting reading.


Nodes may reject e-mails matching "spammer()" or "exploited()" statements signed by trusted peers, and may temp reject mails matching a "spammer_friendly()" statement or one where there's a majority of bad opinions but not sufficient trust in the signer of these opinions. Temp rejection may be enough to curb a wave of spam emanating from one resource, reducing that million-email spam run to hopefully a few hundred or thousand delivered mails.
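
A node-local policy along those lines might look roughly like the following Python sketch. The thresholds (TRUST_MIN, CERTAINTY_MIN), the tuple layout, and the function name are all assumptions made for illustration; the trust map stands for the certainty the local node has given to "signer()" statements. The sketch ignores validity periods and does not let a trusted negative opinion override anything, which a real policy would probably want.

```python
from typing import Iterable, Mapping, NamedTuple


class Opinion(NamedTuple):
    kind: str       # statement type: "spammer", "exploited", "spammer_friendly"
    certainty: int  # -3..3
    signer: str     # pseudonym of the signing peer


TRUST_MIN = 2      # assumed: minimum local trust for a signer to count as trusted
CERTAINTY_MIN = 2  # assumed: minimum certainty for an opinion to count as firm

BAD_KINDS = ("spammer", "exploited")


def decide(opinions: Iterable[Opinion], trust: Mapping[str, int]) -> str:
    """Return "reject", "tempfail" or "accept" for opinions matching one message."""
    opinions = list(opinions)
    # Firm opinions signed by peers we trust enough.
    firm = [
        op for op in opinions
        if op.certainty >= CERTAINTY_MIN and trust.get(op.signer, 0) >= TRUST_MIN
    ]
    if any(op.kind in BAD_KINDS for op in firm):
        return "reject"
    if any(op.kind == "spammer_friendly" for op in firm):
        return "tempfail"
    # No trusted signer: fall back to a simple majority of bad opinions.
    bad = sum(1 for op in opinions if op.kind in BAD_KINDS and op.certainty > 0)
    good = sum(1 for op in opinions if op.kind in BAD_KINDS and op.certainty < 0)
    if bad > good:
        return "tempfail"
    return "accept"


# Usage: same opinion, with and without trust in the signer.
print(decide([Opinion("spammer", 3, "xyz")], {"xyz": 3}))
print(decide([Opinion("spammer", 3, "xyz")], {}))
```

The three return values map naturally onto what a milter or policy daemon can answer: a permanent rejection, a temporary rejection, or letting the message through.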


That depends on how quickly bad reputation spreads.

Another use would be to whitelist messages from authors recommended by trusted peers. Hopefully, positive reputation lasts longer than its negative counterpart; "longer" in the sense that it may prove to be useful for a longer time. A spamming domain often burns quickly, while good domains are steadily used (although an impromptu exploit can always strike).


This is all at a very early stage. I'm working on the P2P aspect right now, will integrate a milter and/or postfix policy daemon interface next year, and deploy to the systems under my control for testing in a friendly environment. Later on I'll need to harden the P2P network against malicious actors (as the system is intended to be publicly joinable, spammers might want to disrupt its organization) and integrate user-friendly management interfaces (a command line interface plus some web stuff).


The emerging P2P aspect is indeed the most interesting part. Peers should in turn earn a reputation, which could be expressed as a coefficient applied to their ratings when computing averages.
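
As a rough illustration of that coefficient idea, a weighted average could be computed as in the Python sketch below. The function name, the input layout, and the choice to give unknown peers a weight of zero are assumptions; how a peer's coefficient is earned is exactly the open question and is not addressed here.

```python
from typing import Iterable, Mapping, Tuple


def weighted_certainty(
    opinions: Iterable[Tuple[str, int]],
    coefficients: Mapping[str, float],
    default: float = 0.0,
) -> float:
    """Average the certainties (-3..3), weighting each by its signer's coefficient.

    opinions:     (signer, certainty) pairs about one statement
    coefficients: signer -> non-negative weight reflecting the peer's reputation
    default:      weight for signers without a coefficient (assumed 0: ignored)
    """
    total = 0.0
    weight_sum = 0.0
    for signer, certainty in opinions:
        weight = coefficients.get(signer, default)
        total += weight * certainty
        weight_sum += weight
    # With no weighted opinions at all, report a neutral 0.
    return total / weight_sum if weight_sum > 0 else 0.0


# Usage: a well-regarded peer says 3, a less regarded one says -3.
print(weighted_certainty([("a", 3), ("b", -3)], {"a": 1.0, "b": 0.5}))
```

In this example the result leans toward the opinion of the peer with the higher coefficient, rather than cancelling out as a plain average would.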


A spam reporting helper may be a later add-on application. This would utilize the database aspects to decide who to send abuse reports to, and what mechanisms to choose. Think distributed spamcop.


Reporting addresses are now easily available via RDAP on numbers. I get roughly 70-80% successful lookups that way.
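
For illustration, extracting the abuse address from an RDAP answer could be sketched in Python as below. The function name is made up, and fetching the JSON from the registry's RDAP server is left out; the code only walks an already decoded response, relying on the standard RDAP layout where entities carry "roles" and a jCard in "vcardArray". An empty result corresponds to the unsuccessful lookups mentioned above.

```python
from typing import Any, Dict, List


def abuse_emails(rdap_object: Dict[str, Any]) -> List[str]:
    """Collect e-mail addresses of entities with the "abuse" role.

    Entities may be nested inside other entities, so the search recurses.
    """
    found: List[str] = []
    for entity in rdap_object.get("entities", []):
        if "abuse" in entity.get("roles", []):
            vcard = entity.get("vcardArray", [])
            # A jCard is ["vcard", [property, ...]]; each property is
            # [name, parameters, type, value].
            properties = vcard[1] if len(vcard) > 1 else []
            for prop in properties:
                if len(prop) >= 4 and prop[0] == "email":
                    found.append(prop[3])
        found.extend(abuse_emails(entity))
    return found


# Usage with a hand-written sample response (not real registry data).
sample = {
    "objectClassName": "ip network",
    "entities": [
        {
            "roles": ["registrant"],
            "entities": [
                {
                    "roles": ["abuse"],
                    "vcardArray": ["vcard", [
                        ["version", {}, "text", "4.0"],
                        ["email", {}, "text", "abuse@example.net"],
                    ]],
                }
            ],
        }
    ],
}
print(abuse_emails(sample))
```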


Best
Ale
--

[*]
https://mailarchive.ietf.org/arch/msg/domainrep/8ZiKu9zWdE69TWw2Hit6XW2Vc0M/

_______________________________________________
mailop mailing list
mailop@mailop.org
https://list.mailop.org/listinfo/mailop
