Using SpamAssassin, but not for spam

Andrews, Rick 29 Oct 2004 01:16:28 -0000

Greetings,

I'm trying to investigate whether SpamAssassin can be used in a non-spam
application that we're trying to build. I've read lots of stuff on the
website but I'm still not sure. I thought I would ask you, the experts.


The application needs to determine whether a certain domain name is
"similar" to another domain name. We have a list of known domain names, and
occasionally want to compare a "target" domain name to see if it is similar
to any of the known domain names. The target might contain replacement
characters ("1" instead of "I" or "L", zero instead of "O", gratuitous dots
or hyphens, etc.) in much the same way that spammers try to get past spam
filters. That's why I thought SpamAssassin might be appropriate. To give an
example, we want to automatically detect that "my-d0m.a1n_name.com" is very
close to "mydomainname.com".
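
To make the comparison concrete, here is a minimal sketch in plain Python that
does not involve SpamAssassin at all. It assumes that folding look-alike
characters onto one representative and dropping punctuation is enough to
expose the similarity. The names CONFUSABLES, skeleton, similarity and
closest_known, the tiny character map, and the 0.85 threshold are all
hypothetical choices made for illustration, not part of any existing tool.

```python
import difflib

# Hypothetical, deliberately tiny look-alike map: "1", "i" and "l" all fold
# to "l", and "0" folds to "o". A real list would need to be much longer.
CONFUSABLES = str.maketrans({"1": "l", "i": "l", "0": "o"})


def skeleton(domain: str) -> str:
    """Reduce a domain name to a canonical form for comparison."""
    lowered = domain.lower()
    # Drop dots, hyphens, underscores and any other non-alphanumeric filler.
    kept = "".join(ch for ch in lowered if ch.isalnum())
    # Fold look-alike characters so both sides use the same representative.
    return kept.translate(CONFUSABLES)


def similarity(target: str, known: str) -> float:
    """Return a score in [0, 1]; 1.0 means the skeletons are identical."""
    matcher = difflib.SequenceMatcher(None, skeleton(target), skeleton(known))
    return matcher.ratio()


def closest_known(target: str, known_domains, threshold: float = 0.85):
    """List (score, domain) pairs at or above the assumed threshold, best first."""
    scored = [(similarity(target, known), known) for known in known_domains]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)


if __name__ == "__main__":
    known = ["mydomainname.com", "example.org"]
    print(closest_known("my-d0m.a1n_name.com", known))
```

Because the same folding is applied to both the known name and the target, it
does not matter that "1" could stand for either "I" or "L". The cost is that
names differing only in those characters are treated as identical, which may
or may not be acceptable for the application.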

But from what I've read, I think it may not be appropriate for several
reasons:

1) We would probably have much more ham (known domain names) than spam
(names close to a known domain name, but not legitimate)

2) We wouldn't have large amounts of ham or spam to feed through
SpamAssassin to enable it to learn and improve

3) The "target" domain name would in most cases be a single token as far as
SpamAssassin is concerned, unlike an email, which likely contains hundreds of
tokens from which to decide whether it is spam

What do you think? Would it take a lot of work to adapt SpamAssassin for
this application? Does it seem like an appropriate tool to use?

Thanks in advance,

-Rick
