ching as much as possible, and go with
[\s\(\)\-\.]+ (heck, I would probably go as far as just doing \W+ or \W*
to catch any characters the spammers might try to throw in).
--
Chris Petersen
Programmer / Web Designer
Silicon Mechanics: http://www.siliconmechanics.com/
Blade Servers: http://www.
> Let's not forget parentheses. Here is how I would have it look.
> [\s(\(|\-|\)\.]+
> Well I hope it is correct.
[\s\(\)\-\.]+
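
A quick way to see what this character class does, sketched with Python's `re` module rather than SpamAssassin itself (the sample strings and the `strip_separators` helper are made up for illustration). It treats any run of whitespace, parentheses, hyphens and dots as one separator, whereas the quoted class also accepts a literal `|`, since alternation has no meaning inside brackets.

```python
import re

# The class from the reply: whitespace, "(", ")", "-" and "." in any mix.
SEPARATORS = re.compile(r"[\s\(\)\-\.]+")

# The quoted attempt: inside [...] the "|" is a literal character,
# so this class also matches pipes.
QUOTED = re.compile(r"[\s(\(|\-|\)\.]+")


def strip_separators(text: str) -> str:
    """Remove every run of separator characters (hypothetical helper)."""
    return SEPARATORS.sub("", text)


if __name__ == "__main__":
    for sample in ["(555) 123-4567", "555.123.4567", "555 - 123 - 4567"]:
        print(sample, "->", strip_separators(sample))
    # Does each class accept a lone pipe character?
    print(bool(SEPARATORS.fullmatch("|")), bool(QUOTED.fullmatch("|")))
```
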
1 ] ; then
/etc/init.d/spamassassin restart 2>/dev/null 1>/dev/null
echo "Restarted SpamAssassin"
fi
de, especially when it's a LOT of code and you don't really know what
you're looking for.
y tokens?
(if not, it should) What about message headers? Does it tokenize
rawbody or body? Does it tokenize only word-based characters, or would
something like "[EMAIL PROTECTED]@" become a token?
I'd honestly like some answers to these questions - I've asked before
but didn
ish email (no way to guarantee that if a server is in the
US that all of its users send english-only email), and here I'm counting
programming languages as non-english. I often mail code around to
friends/clients/whatever, and I can guarantee that more than 75% of it
would set off a spell check
eir
text is subject to trademark violations, since the right to use the
trademark is granted only to those who send messages compliant with
their definition of not-spam.
;
same thing you'd use for invoking spamassassin or any other kind of
filter.
If you have virtual users, you'll need to look at the AuthCourier.pm
module submitted to the courier-users mailing list. It makes spamd work
with courier's authdaemon and properly handles homedirs for u
> How many do think would end up with an overly outgoing well blessed
> healthy abit thin gorilla with a huge desire to descramble tv while
> becoming financially secure and well insured
don't forget well-endowed in both male and female aspects!
that get
tossed into the bunch.
've done serious regex stuff
(and I used to do it for a living).
rds
score WORDWORD_10 .5
rawbody WORDWORD_15 /(?:\b(?!(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){15}/
describe WORDWORD_15 string of 15+ random words
score WORDWORD_15 2.5
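
To check how a rule like WORDWORD_15 behaves, here is a minimal sketch that uses Python's `re` module as a stand-in for SpamAssassin's Perl regex engine; the test strings are invented. It compares the negative lookahead `(?!...)` with the easy-to-mistype `(?!=...)`, which only rejects a literal `=` and so never excludes the listed words.

```python
import re

STOPWORDS = r"(?:from|even|more|were|with)"

# Negative lookahead: a word may not be one of the stopwords.
LOOKAHEAD = re.compile(r"(?:\b(?!" + STOPWORDS + r"\b)[a-z]{4,12}\s+){15}")

# Mistyped form: "(?!=" means "not followed by a literal '='",
# which is always true at the start of a lowercase word.
MISTYPED = re.compile(r"(?:\b(?!=" + STOPWORDS + r"\b)[a-z]{4,12}\s+){15}")

random_words = "word " * 15                        # 15 qualifying words in a row
broken_run = "word " * 7 + "with " + "word " * 7   # a stopword splits the run

if __name__ == "__main__":
    for name, text in [("random_words", random_words), ("broken_run", broken_run)]:
        print(name, bool(LOOKAHEAD.search(text)), bool(MISTYPED.search(text)))
```
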
mization and not grab backreferences.
he message is spam
or not."
> Would this regex make more sense?
> /([a-z]{4,12}\s){12,}/
Yes, though I used:
/(\b[a-z]{4,12}\s+){12}/
Notice the initial \b, and there's no need to make SA continue to search
beyond the "minimum" match, so leave off the , in the last {} cluster.
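
The point about the trailing comma can be checked with a small sketch, again using Python's `re` module as an assumed stand-in for SpamAssassin's matcher (the sample text is invented). `{12}` and `{12,}` agree on whether a message matches at all; the open-ended form just consumes more text.

```python
import re

WORD = r"(\b[a-z]{4,12}\s+)"
EXACT = re.compile(WORD + r"{12}")        # stops after 12 words
OPEN_ENDED = re.compile(WORD + r"{12,}")  # keeps consuming past 12 words

samples = {
    "eleven": "word " * 11,
    "twelve": "word " * 12,
    "twenty": "word " * 20,
}


def matched_length(pattern: re.Pattern, text: str) -> int:
    """Length of the matched text, or 0 when there is no match."""
    match = pattern.search(text)
    return len(match.group(0)) if match else 0


if __name__ == "__main__":
    for name, text in samples.items():
        print(name, matched_length(EXACT, text), matched_length(OPEN_ENDED, text))
```
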
like a/and/the? It'd be easy for spammers to get around, but at least
it would keep them out of inboxes for a while.
> The whitelist part is a misnomer. It's an automatic score adjuster
> (white/black-list if you want).
I realize this. Just figure that the name should be more informative.
Better yet, shouldn't it be somehow tied to the bayes DB? These
messages are correctly scoring "0% chance of spam" from B
I recently started receiving spam addressed from someone on one of the
mailing lists I'm on, and since at about that time, my own address
started "sending" spam, we determined that the web archive of the list
had been spidered. Anyway, now, whenever I get mail from him, SA has
tagged it as spam du
Someone recently posted a spamd modification that allows it to access
courier mta's authdaemon to properly find home directories, etc for
virtual users. It was requested that he send it over to you guys, but I
didn't see anything float through this list about it. Has it been
received? It would b
> I'm too embarrassed to tell people I use pico...
I was trying to avoid the editor-war, but I have to say that I'm right
there with you (though when I can, I use nano because it has a few more
features but still keeps the nice EASY TO USE interface).
Then again, when I really want to code, I u
> 1) put the full URL of the canonical source into the file itself, so
> people know from where to get updates
Or, for those who like a little automation:
#!/bin/sh
wget -N http://www.merchantsoverseas.com/wwwroot/gorilla/bigevil.cf \
-O /etc/mail/spamassassin/bigevil.cf \
2>&1 | gr
With spammers resorting to misspellings, has anyone thought of combining
the bayes token stuff with soundex (or something similar) matching?
I don't know much about how bayes works, but this sounds like a decent
idea...
-Chris - getting tired of "Viagraaa" ads.
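
As an illustration of the idea, and not of anything SpamAssassin's Bayes code is known to do, here is a sketch of classic American Soundex in Python; the function name is made up. Misspellings that only repeat or add vowels collapse to the same code, so a token store keyed on the code would treat them as one token.

```python
# Letter-to-digit table for American Soundex; vowels, h, w and y have no digit.
CODES = {
    letter: digit
    for digit, letters in {
        "1": "bfpv", "2": "cgjkqsxz", "3": "dt",
        "4": "l", "5": "mn", "6": "r",
    }.items()
    for letter in letters
}


def soundex(word: str) -> str:
    """Return the 4-character Soundex code of a word ('' if it has no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = CODES.get(c, "")
        if digit and digit != previous:
            code += digit
        if c not in "hw":  # h and w do not separate equal digits; vowels do
            previous = digit
    return (code + "000")[:4]


if __name__ == "__main__":
    for token in ["Viagra", "Viagraaa", "Robert", "Rupert"]:
        print(token, soundex(token))
```

The trade-off is that unrelated words can share a code too, so such codes would probably have to supplement the exact tokens rather than replace them.
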
> Whatever I do, sa-learn will not accept it as spam. :(
It takes about 500-1000 learned spams before SA starts marking things
with the bayes filter. I've put at least that many in, and still only
about half of my spams have bayes markings in them (and about 10% of my
daily spam still gets throug
> FYI, works just fine. We've deployed it here using the full email as
> the key. We just made the field longer.
hmm, maybe I'll have to look into this. shouldn't be too hard to add
courier auth tool support to spamd so I wouldn't have to set up a
separate mysql database for this kind of thing.
I've asked about this before, but could never get a straight answer, so
I figured I'd try again...
my problem:
I use courier and maildrop, and have virtual users. The virtual users
are all owned by the same user/group, which obviously has a different
homedir than the virtual users. I also have
> Why do we need to authenticate the user of spamc at all? Are we
> worried about a remote user running spamc on their box and forging mail
> through ours? A local user forging something through our box?
I think this was mentioned in an earlier thread, but as I understand it,
the worry is that
> Problem: Use SA on Aliases and / or virtusers AND real users combined.
yup, that one exactly.
> So far the only solution looks like using MySQL. (Which for reasons of my
> own I don't want to do right now.)
From what I've seen, a simple solution would be to get spamc to pass a
couple more bi
> I doubt that passing a couple of variables to spamd would increase the
> overhead of spamc by anything noticeable. But I get your point. I'm
> just trying to figure out a way to make this thing work properly with my
> setup and that seemed like the easiest solution short of hardcoding
> spamd t
> The purpose of spamc is to be as lightweight as possible. Ideally
> spamc will not contain any features that aren't useful by most people.
I doubt that passing a couple of variables to spamd would increase the
overhead of spamc by anything noticeable. But I get your point. I'm
just trying to f
> Why not use an SQL database instead? That's what we do with vpopmail
> (although I'd love for qmail-scanner to have support for virtual users, oh
> well) and it works just dandy.
sql database for what? authentication? Or everything? The only reason
these users are "virtual" is because they'
ts $ENV{HOME} var to
spamd...
why not just let spamc handle more of the normal spamassassin commands,
anyway?
-Chris
On Wed, 23 Oct 2002, Chris Petersen wrote:
> To: [EMAIL PROTECTED]
> From: Chris Petersen <[EMAIL PROTECTED]>
> Subject: [SAtalk] spamd and virtual users
> Date
I've recently run into an issue... I use courier-mta's userdb auth to set
up virtual accounts for a few domains I host on my machine. This is nice,
since I don't need to create system accounts on my machine for people who
have no right to be in there. I finally figured out why I hadn't been
azoogle.com should be added to the global "bad hosts" list. They don't
seem to forge their headers, so a simple from/body check should handle it.
and here they go, off to my server's blacklist.
-Chris
> For my personal use (in qmail) if I can have it also look at "Delivered-To:"
> then my problem will be solved. I'm not sure how to handle this in sendmail,
> but if someone can help me figure out where SA is determining the recipient of
> a message I'd like to try to fix this myself by making i
> The solution: Change the way it reports on the subject line, and let it
> all through.
That's how I handle most of my stuff...
> In PerMsgStatus.pm (which is where the subject line is changed) I am
> adding two additional variables: _SPLV_ and _SPLG_.
> Spamlevel = hits/threshold (so hits of 1
You'd think that "increate penis size" would trigger some rule or another.
Maybe freehostchina.com should also be added for some score or another,
too? (I don't know how legit of a service this is, but most things coming
out of Chinese webhosts these days aren't good).
-Chris
> So if anyone else thinks this would be useful what other categories are
> there? Here is what Postini offers:
I really like this idea.
as you said, it could be used to NOT filter out certain things... so if
you want "get rich quick" mail, you dis/enable it in your user file and
those scores
> > Thanks for releasing the update to 2.31. Is there an ETA on RPM
> > availability?
> >
> ftp://ftp.kluge.net/pub/felicity/SRPMS/spamassassin-2.31-1.src.rpm ...
> I usually suggest people rebuild from the SRPM since there may be
> different versions of perl involved and such.
Is there ever
I just noticed that I'm still running 2.2, and when I went to download
the latest rpm, I noticed that the site still only has 2.2... Has
anyone made rpm's of 2.3 or later?
And would it be possible for some rpm-knowledgeable person on the dev
team to move the spec file into the .tar.gz branch so
A friend of mine recently suggested the idea of using a "mousetrap" email
address for detecting spam. I've heard of this technique being used for
tracking spam, but it had never occurred to me that I could use it to catch
spammers.
Basically, the idea is to put an email address on a web page, ei
> > 0.01 * 10^34 = 10^32 times. at 1,000,000,000 tries per second, that
> > will only take you 10^23 seconds = roughly the age of the universe.
>
> Not to mention the challenge of coming up with 10^32 unique intelligible
> ways of talking about penis enlargement, multilevel marketing, and wild
>
> Subjects being slightly different shouldn't be a problem because you can do
> soundex or "like" searches when you have the data set.
good point. advanced comparisons like that would help a lot.
> I was debating the reply-to and from but maybe it's best just to use all of
> them for now. Aw
> It's just an old habit. When I learned SQL I was taught (mostly from
> the big SQL books) and of course the little black book of normalization,
> _Handbook of Relational Database Design_ that table columns should try
> to be unique yet understandable.
ahh. I started db stuff with filemaker (a
> 1) Razor uses SHA1, not MD5.
ah, noted.
> 2) Either way, while you're correct (you _can_ have multiple inputs
> with the same resulting hash), it's very unlikely to find two sets of
> different data with the same hash output. So in reality, MD5/SHA1/etc
> aren't unique, but they're u
> Now I really want to do this. I'll see what I'm up to this weekend. :-)
heh, it all looks good to me. I think I'm just not quite sure what you're
up to (that, and underscores in field names confuse me for some reason ;).
> What really can you track with this besides scoring and the correla
> One thing I want to do is write a little C program that connects to Postgres
> (or Perl but with a C client just like spamc/d) and reports on the tests that
> *all* messages score on.
wouldn't it be easier to integrate this into spamd? You'd already have
your db client set up that way.
> F
> Is it a properly formatted header according to the relevant RFCs? If
> not, this entry in an ACL in my exim.conf rejects it at SMTP time.
not sure there. presumably it's all ok.
> If you want, how about coming up with a test like exim's that looks
> for syntactic validity of the header. All
> It depends on your setup. If each user is invoking spamassassin
> directly from procmailrc, then it's no problem. If spamd is running
> as root (or some other user), then there can be security concerns,
> especially since some of the rules require an eval.
aha, that makes sense, then...Ev
like the subject says... This is one of the most common spam recipients
that I receive... would be nice to get it added to the master list.
also, what's the reasoning for not letting users define filtration regex's
in their user files?
-Chris