[SAtalk] Integration with Distributed Checksum Clearinghouse?

Jonathan Bradshaw Tue, 12 Feb 2002 08:15:16 -0800

Is anyone looking at integration with the Distributed Checksum
Clearinghouse (DCC)? I'm using it as an additional filter from procmail
right now, but it would be nice to integrate the detection and reporting.


For anyone not familiar, DCC information is at
http://www.rhyolite.com/anti-spam/dcc/ -- It is similar to Vipul's Razor
but more advanced.

-- Cut Here --

The DCC or Distributed Checksum Clearinghouse is a system of clients and
servers that collect and count checksums related to mail messages. The
counts can be used by SMTP servers and mail user agents to detect and
reject or filter unsolicited bulk mail or spam. DCC servers can exchange
common checksums. The checksums include values that are constant across
common variations in bulk messages, including "personalizations."

The basic idea of the DCC is that if mail recipients could compare the
mail they receive, they could recognize and deal with bulk mail. A
clearinghouse server totals reports of checksums of messages from
clients and answers queries about the total counts for individual
checksums. Each recipient decides independently how to handle each bulk
message. The recipient of a message or a DCC client can report and ask
about the total counts for several different checksums for each message.
If one or more totals for a message are higher than thresholds set by
the client, a DCC client that is part of an SMTP server can log, discard,
or reject the message with an SMTP 400- or 500-series status value. DCC
clients that are parts of mail user agents can discard, file, or score
messages based on their "bulkiness."
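
The report-and-threshold idea above can be sketched in a few lines of
Python. This is only a toy model under stated assumptions: the
Clearinghouse class, the is_bulk helper, the checksum kinds "body" and
"from", and the threshold value are all hypothetical names chosen for
illustration, and nothing here is the real DCC protocol or client API.

```python
from collections import Counter


class Clearinghouse:
    """Toy in-memory stand-in for a DCC server (hypothetical, not the real protocol)."""

    def __init__(self):
        self._totals = Counter()

    def report(self, checksums):
        """Count one more copy of each (kind, value) checksum and return
        the updated totals, keyed by checksum kind."""
        totals = {}
        for kind, value in checksums.items():
            self._totals[(kind, value)] += 1
            totals[kind] = self._totals[(kind, value)]
        return totals


def is_bulk(totals, thresholds):
    """True if any total is higher than the client's threshold for that kind."""
    return any(totals.get(kind, 0) > limit for kind, limit in thresholds.items())


server = Clearinghouse()
thresholds = {"body": 2}  # assumed client policy: more than 2 copies is bulk

verdicts = []
for _ in range(3):  # the same message arrives for three recipients
    totals = server.report({"body": "d41d8c", "from": "sender@example.net"})
    verdicts.append(is_bulk(totals, thresholds))
```

Each client keeps its own thresholds, so two sites talking to the same
server can treat the same message differently.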

With restrictions on the sources of checksums, you can be confident that
only the checksums of unsolicited mail are in a DCC database. An
isolated DCC server fed by private "spam traps" or "spam bait" does not
need a white list. However, unless you restrict the sources of checksums
counted by all servers in a group of cooperating servers, they detect
solicited as well as unsolicited bulk mail. That usually mandates the
use of white lists to ensure that solicited bulk mail is not rejected.

Because simplistic checksums of spam would not be very effective, the
main DCC checksum is fuzzy and ignores various aspects of messages. The
fuzzy checksum will need to be changed as spam evolves. Since the DCC
started being used in production in late 2000, new checksum schemes have
been developed and distributed several times.
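
To make "fuzzy" concrete, here is a minimal sketch, assuming a much
cruder normalization than whatever the DCC really uses. The
fuzzy_checksum function is hypothetical; it simply ignores case,
digits, punctuation, and whitespace before hashing, so that two copies
differing only in an order number get the same checksum.

```python
import hashlib
import re


def fuzzy_checksum(body: str) -> str:
    """Hash a message body after discarding details that vary between
    copies. Illustration only; this is not the DCC's fuzzy checksum."""
    text = body.lower()
    # Collapse every run of non-letters (digits, punctuation, whitespace)
    # into a single space.
    text = re.sub(r"[^a-z]+", " ", text)
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


copy_a = "Hello!  Your order #12345 has shipped.\n"
copy_b = "hello! your order #98765 has   shipped."
other = "Hello! Your invoice #12345 is overdue."

same = fuzzy_checksum(copy_a) == fuzzy_checksum(copy_b)
different = fuzzy_checksum(copy_a) != fuzzy_checksum(other)
```

A plain hash of the raw bodies would differ between copy_a and copy_b,
which is why a simplistic checksum is easy for a sender to defeat.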

Unless used with isolated servers and so losing some of its power, the
DCC does cause some additional network traffic. However, the
client-server interaction for a single mail message consists of
exchanging a single pair of UDP/IP datagrams. That is often less than
the several pairs of UDP/IP datagrams required for a single DNS query.
Most SMTP servers make at least one DNS query for every message to check
the envelope Mail_From value and often several more. As with the Domain
Name System, DCC servers can be placed near active clients to reduce the
DCC network costs. DCC servers should exchange or "flood" reports of
checksums, but only the checksums of messages that are frequently seen.
Since the vast majority of mail is sent to only a few people, only a
small fraction of reports of checksums need to be exchanged by servers.
That significantly reduces the network bandwidth and disk storage needed
by a DCC server.
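
The saving from flooding only frequently seen checksums can be
illustrated with a small sketch. The select_for_flooding function, the
sample totals, and the flood threshold of 10 are assumptions made up
for this example; the text above does not say what limit real DCC
servers use.

```python
def select_for_flooding(totals, flood_threshold):
    """Return only the checksums whose total has reached the threshold.
    Hypothetical helper; real servers apply their own rules."""
    return {c: n for c, n in totals.items() if n >= flood_threshold}


# Most mail goes to only a few people, so most totals stay small.
totals = {"aaa": 1, "bbb": 2, "ccc": 250, "ddd": 1, "eee": 40}

flooded = select_for_flooding(totals, flood_threshold=10)
fraction = len(flooded) / len(totals)
```

In this made-up sample only two of the five checksums would be passed
on to peer servers.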



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
