I understand that SA might not be exactly what I'm looking for in its
present form, but I know it's got to be close.  Here in more detail is
exactly what I'd like to do.

The Perl script I have in mind could use the Net::POP3 module to connect to
a given server, log on to it, and issue RETR or TOP commands to retrieve the
text of each message in a box.  It would do this for each box, and that
would end step #1.  Notice that the "RETR" or "TOP" commands do _not_ delete
the messages that are being retrieved; only the DELE command (followed by
QUIT with no intervening RSET) will remove them.
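
A minimal sketch of step #1, written here in Python with the standard
poplib module rather than Perl (Net::POP3 has corresponding top, get,
delete, reset and quit methods).  The function names and the 20-line preview
length are made up for illustration; "conn" is assumed to be a logged-in
poplib.POP3 object.

```python
import poplib


def open_mailbox(host, user, password):
    """Log in over plain POP3 (hypothetical helper; POP3_SSL also exists)."""
    conn = poplib.POP3(host)
    conn.user(user)
    conn.pass_(password)
    return conn


def fetch_previews(conn, body_lines=20):
    """Return {message_number: raw_bytes} using TOP; nothing is deleted."""
    count, _mailbox_size = conn.stat()
    previews = {}
    for num in range(1, count + 1):          # POP3 numbers messages from 1
        _resp, lines, _octets = conn.top(num, body_lines)
        previews[num] = b"\r\n".join(lines)
    return previews


def delete_and_commit(conn, doomed):
    """Mark messages with DELE, then QUIT so the server really removes them."""
    for num in doomed:
        conn.dele(num)
    conn.quit()   # an RSET before this point would un-mark everything
```

Swapping conn.top(num, body_lines) for conn.retr(num) would fetch whole
messages instead of headers plus the first few body lines, at the cost of
more traffic between the two servers.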

This would be a good place to subject each message to SpamAssassin.  If the
message is thrown out as spam it dies here.  Otherwise, pray continue...
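
One possible way to wire in that check, sketched in Python.  It assumes the
spamc client is installed and a spamd daemon is reachable, and relies on
"spamc -c" exiting with status 1 when it judges the message to be spam.  The
function name and the "command" parameter are illustrative only.

```python
import subprocess


def is_spam(raw_message, command=("spamc", "-c")):
    """Pipe one raw message (bytes) to the checker; exit status 1 means spam."""
    result = subprocess.run(
        list(command),
        input=raw_message,
        stdout=subprocess.DEVNULL,   # -c prints "score/threshold"; unused here
    )
    return result.returncode == 1
```

SpamAssassin scores a whole message, so feeding it the output of RETR should
give better results than feeding it a TOP preview.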

Next, the script will extract a sample from each surviving message and build
an array of the samples.
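
What counts as a "sample" is left open above.  As one assumption, the sketch
below (Python, standard email module) takes the first text part of the
message, collapses whitespace, lowercases it, and keeps the first 500
characters.  The names and the 500-character limit are invented for the
example.

```python
import email
import re


def extract_sample(raw_message, max_chars=500):
    """Return a normalised text sample from the first text part of a message."""
    msg = email.message_from_bytes(raw_message)
    part = msg
    if msg.is_multipart():
        # Take the first text/* part; skip containers and attachments.
        part = next((p for p in msg.walk()
                     if p.get_content_maintype() == "text"), None)
        if part is None:
            return ""
    payload = part.get_payload(decode=True) or b""
    text = payload.decode("latin-1")     # never raises; good enough for matching
    text = re.sub(r"\s+", " ", text).strip().lower()
    return text[:max_chars]


def build_samples(messages):
    """messages: {msg_number: raw_bytes} -> {msg_number: sample_text}"""
    return {num: extract_sample(raw) for num, raw in messages.items()}
```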

It will then compare each sample to every other sample (not sure what
algorithm to use here, but it's basically like a full-text-search function)
to determine closeness-of-fit.  Messages whose samples have a "high"
closeness-of-fit to "several" other messages would be considered spam and
would be deleted.
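
The comparison algorithm is undecided above, so the following is only one
candidate, sketched in Python: break each sample into overlapping three-word
"shingles" and score pairs by Jaccard similarity (shared shingles divided by
all distinct shingles).  The 0.6 threshold for "high" and the count of 2 for
"several" are placeholder values, not tuned numbers.

```python
from itertools import combinations


def shingles(text, size=3):
    """Set of overlapping word n-grams ("shingles") from a sample."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Shared shingles divided by total distinct shingles (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_bulk(samples, threshold=0.6, min_matches=2):
    """Return keys of samples that closely match at least min_matches others."""
    sets = {key: shingles(text) for key, text in samples.items()}
    matches = {key: 0 for key in samples}
    for x, y in combinations(sets, 2):       # each unordered pair once
        if jaccard(sets[x], sets[y]) >= threshold:
            matches[x] += 1
            matches[y] += 1
    return {key for key, n in matches.items() if n >= min_matches}
```

Comparing every pair is quadratic in the number of messages, which should be
tolerable for a handful of mailboxes polled from cron.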

The "honeypot" idea thus described relies on the fact that when spammers
send out a message they send more-or-less the same message to bezillions of
addresses, including not just one of yours but all the ones they can find.
They'll also pick up addresses that you might never use for anything ...
these accumulate nothing but spam, so that anything which has even a
"not-so-high" closeness-of-fit to one of those messages is likely to be
spam... or to put it another way, "anything with a high closeness of fit to
any one of these messages is spam."
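
The honeypot rule differs from the cross-comparison in that a single match
is enough and the bar can be lower.  A small Python sketch under those
assumptions follows; the word-overlap measure and the 0.5 threshold are
placeholders, and any other similarity function can be passed in instead.

```python
def word_overlap(a, b):
    """Jaccard overlap of the word sets of two samples (0.0 to 1.0)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def matches_honeypot(sample, honeypot_samples, threshold=0.5,
                     similarity=word_overlap):
    """True if the sample resembles ANY message sent to a never-used address."""
    return any(similarity(sample, trap) >= threshold
               for trap in honeypot_samples)
```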

A different variation of the honeypot approach would be to take samples of
the last "n" messages that have flowed through a server, to see if the
incoming message has a strong resemblance to ones that have recently come
in, especially to different people.  An in-line mail filter could then
"learn" to recognize the mail as probable-spam and throttle off the stream
so that fewer people get it.  But that's off-topic to what _I_ need to do.

The filtering program that I'm wanting to write _will_ run on an
Internet-connected server (far away in New Hampshire) and will communicate
with the ISP's mail host (in Iowa I think) over high speed backbones.  It
will flush and discard the mail that we don't want to see before we, logging
in over a phone line, connect to the same [Iowa] computer to retrieve the
mail.



At 11:20 AM 4/3/02 -0500, Nick Fisher wrote:
>I think you have not totally understood what SA does or how it does it. The
>idea that you can just sample the mail and tell if it is spam is not a
>design goal of SpamAssassin. SA was designed to work on an entire
>message.... as far as I can tell anyhow. If I'm wrong here I'm sure someone
>will correct me.
>Basically I think you need to look at the whole body of the message without
>downloading it. There are four ways I can see of doing this...
>
>1) If you can access your mailboxes using IMAP you could write a
>SpamAssassin IMAP robot. The robot would check your mail and delete any
>spam. Problem being that you need an external server to host the robot on.
>Talk to your ISP.
>
>2) Use a pay service like SpamCop. There is talk of a SpamAssassin service
>but nothing is there yet.
>
>3) Change your setup. Go to a different ISP/hosting company (with better spam
>protection), get a cable modem, buy and host a dedicated server, etc.
>
>4) Write a cut-down version of SA that just works on headers and TOPs of
>messages. It would not be as accurate but it might cut down on the amount
>you have to download.
>
>       Nick
>
>> -----Original Message-----
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED]]On Behalf Of
>> Sundial Services International, Inc.
>> Sent: Wednesday, April 03, 2002 10:44 AM
>> To: Rob McMillin; [EMAIL PROTECTED]
>> Subject: Re: [SAtalk] Using SpamAssassin if you don't own the mail
>> server ?
>>
>>
>> Gentlebeings... you are all on high-speed links and you're thinking "set up
>> your own POP server..."  As though that were no problem.
>>
>> And no, it's not that I don't know just how to do just that.  But my
>> particular circumstances preclude that approach.  In effect I need to (doing
>> the entire job non-root from an ordinary user account) periodically query
>> the mail on a server ... run it through filtering but also cross-comparison
>> ... and delete the mail I don't want to log on [think TELEPHONE DIALUP,
>> think POOR LINES] and laboriously retrieve.
>>
>> And I really -do- need to do it _that_ _way_.
>>
>>
>>
>>
>> At 07:26 AM 4/3/02 -0800, Rob McMillin wrote:
>> >Sundial Services International, Inc. wrote:
>> >
>> >>Here's my problem.  We use an external ISP to handle our mail, and of course
>> >>we are getting pummeled with spam so fast that the mailbox can fill up
>> >>within hours.  We use a different ISP to handle the web-site and can set up
>> >>programs on that.
>> >>
>> >>What I want to do, unless it has already been done, is to construct a Perl
>> >>script that can then be run periodically as a cron-job.  What this script
>> >>would do is to interrogate each of the mailboxes we use, determine what's
>> >>spam among them, and delete those messages so that only they come down the
>> >>line when we download mail.
>> >>
>> >Ha! Fancy you should mention it. I have a friend who is in a very
>> >similar situation. He has a publicly visible mail address that he
>> >*cannot* get rid of -- it's his business mail, and has been visible on
>> >his site since he opened shop over five years ago. It's hosted at the
>> >mailserver of the company that bought his business. He gets 100+ spams
>> >to this address daily. What you want is to set up a separate server with
>> >its own POP3 server, and use fetchmail with the following ~/.fetchmailrc
>> >
>> >poll <pop3_servername> protocol pop3 username <username> password <pwd>
>> >mda "/usr/bin/procmail -d %T"
>> >
>> >for each affected account, assuming you use POP3 to pick up your mail.
>> >Then, follow the usual installation/delivery instructions for
>> >SpamAssassin, and change all your mail clients to point to the new mail
>> >server for delivery. It's not quite what you're asking for, because it
>> >sounds like you're using IMAP for delivery, but I think it's the best
>> >you can do under the circumstances.
>> >
>> >--
>> >    http://www.pricegrabber.com
>> >          "We are smarter individually." -- Larry Niven
>> >
>> >
>> >
>> >
>> >
>> >
>>
>>  Sundial Services International Inc.
>> =============================================================
>> - Scottsdale AZ  (480) 946-8259; fax (480) 874-2068
>> - Innovative solutions for complex database issues!
>> - http://www.sundialservices.com/
>> - PGP public key at http://www.sundialservices.key/pgp.key
>>
>>
>> _______________________________________________
>> Spamassassin-talk mailing list
>> [EMAIL PROTECTED]
>> https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
>>
>
>

 Sundial Services International Inc.
=============================================================
- Scottsdale AZ  (480) 946-8259; fax (480) 874-2068
- Innovative solutions for complex database issues!
- http://www.sundialservices.com/ 
- PGP public key at http://www.sundialservices.key/pgp.key


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
