Re: [WL] Re: [SAtalk] More obfuscation

2004-01-21 Thread David B Funk
On Tue, 20 Jan 2004, Charles Gregory wrote: > Right now, there would be no statistics, because the text obfu has just > started. But as a side note, we don't have the disk space to run Bayes for > all our users though I'm getting awfully tempted to talk the boss into > an extra disk or two. So

Re: [SAtalk] More obfuscation

2004-01-21 Thread Charles Gregory
On Tue, 20 Jan 2004, Robert Menschel wrote: > CS> I'm not sure where the post is, but about 3 weeks ago I think Dallas > CS> put a semi-end to the spell-checker debate :) Perhaps I need to re-clarify. The idea is NOT to treat mis-spelled words as spam. The idea is to find specific 'close matches'
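The 'close match' idea described above (flag words that are nearly, but not exactly, a known spam word, rather than flagging every misspelling) can be sketched roughly as follows. This is only an illustration in Python using the standard `difflib` module, not SpamAssassin code; the watch list, the `near_misses` name and the 0.8 cutoff are all assumptions made for the example.

```python
import difflib
import re

# Hypothetical watch list; a real one would be chosen and tuned by the admin.
WATCH_WORDS = ["viagra"]


def near_misses(text, watch_words=WATCH_WORDS, cutoff=0.8):
    """Return (token, watch_word) pairs for tokens that are close to,
    but not identical to, a word on the watch list."""
    hits = []
    for token in re.findall(r"[A-Za-z]+", text):
        word = token.lower()
        if word in watch_words:
            continue  # exact matches are left to the ordinary rules
        # difflib scores similarity from 0.0 to 1.0; cutoff is an assumed threshold
        match = difflib.get_close_matches(word, watch_words, n=1, cutoff=cutoff)
        if match:
            hits.append((token, match[0]))
    return hits


print(near_misses("I heard you need viagrPa."))
```

Because only words on the watch list are compared, ordinary typos elsewhere in a message are ignored, which is the distinction from a general spell checker.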

Re: [WL] Re: [SAtalk] More obfuscation

2004-01-20 Thread Lucas Albers
Detecting obfuscation: HTML garbage tags: done. Normal language letter frequency: easy to do, but easy to get around: just modify random keywords to generate the same frequency as English words. This would still catch the stupider spammers doing Bayes poisoning. Detect poisoning attempt, and reject an addition to
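A letter-frequency check of the kind mentioned above might look something like this. It is a rough Python sketch under stated assumptions: the frequency table holds approximate, commonly quoted percentages for English, and the `letter_distance` name is made up for the example. As the post notes, a spammer who matches English frequencies would get past it.

```python
from collections import Counter

# Approximate English letter frequencies in percent (rounded, commonly quoted
# values; treat them as an assumption, not as measured data).
ENGLISH_FREQ = {
    "e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0, "n": 6.7, "s": 6.3,
    "h": 6.1, "r": 6.0, "d": 4.3, "l": 4.0, "c": 2.8, "u": 2.8, "m": 2.4,
    "w": 2.4, "f": 2.2, "g": 2.0, "y": 2.0, "p": 1.9, "b": 1.5, "v": 1.0,
    "k": 0.8, "j": 0.15, "x": 0.15, "q": 0.10, "z": 0.07,
}


def letter_distance(text: str) -> float:
    """Sum of absolute differences between the letter distribution of `text`
    and the English table. 0.0 means identical, 2.0 is the maximum."""
    letters = [c for c in text.lower() if c in ENGLISH_FREQ]
    if not letters:
        return 0.0  # nothing to measure; the caller decides what that means
    counts = Counter(letters)
    total = len(letters)
    expected_total = sum(ENGLISH_FREQ.values())
    return sum(
        abs(counts[ch] / total - ENGLISH_FREQ[ch] / expected_total)
        for ch in ENGLISH_FREQ
    )


plain = "The idea is to compare the letters in a message with ordinary English text"
noise = "xqzjk vwpqz kkjxq zzqxj qxzvk"
print(letter_distance(plain), letter_distance(noise))
```

Any threshold separating 'normal' from 'garbage' would have to be tuned on real mail; short messages will score noisily.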

Re: [WL] Re: [SAtalk] More obfuscation

2004-01-20 Thread Charles Gregory
On Wed, 21 Jan 2004, Sidney Markowitz wrote: > Does anyone who is concerned about the obfuscation have any statistics > to show that it really is a problem for the current rules plus network > tests plus a well-trained Bayes? Right now, there would be no statistics, because the text obfu has jus

Re: [WL] Re: [SAtalk] More obfuscation

2004-01-20 Thread Sidney Markowitz
Charles Gregory wrote: > So I guess the question is, how 'expensive' would it be in terms of processing power There's also the question of how much benefit would it have. I recall someone trying out searching for close matches to spam words in a corpus and not getting very good results at picking u

RE: [SAtalk] More obfuscation

2004-01-20 Thread Rose, Bobby
Wouldn't DCC or Razor pick this up after some reports? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Christopher X. Candreva Sent: Tuesday, January 20, 2004 5:12 PM To: [EMAIL PROTECTED] Subject: Re: [SAtalk] More obfuscation On Tue, 20 Jan

Re: [WL] Re: [SAtalk] More obfuscation

2004-01-20 Thread Charles Gregory
On Tue, 20 Jan 2004, Marcus Frischherz wrote: > But there is: there exists (at least in PHP) a function called > levenshtein, which calculates the similarity between two words. Surely > there must exist a perl equivalent to it. see: > http://at.php.net/manual/en/function.levenshtein.php So I g
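For readers unfamiliar with the Levenshtein distance mentioned above: it counts the minimum number of single-character insertions, deletions and substitutions needed to turn one word into another. A minimal Python sketch of the usual dynamic-programming approach follows; it is an illustration only, not the PHP function or a Perl port, and the function name is simply chosen to match.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


print(levenshtein("viagrpa", "viagra"))  # one inserted letter
```

On the processing-cost question, each comparison takes time proportional to the product of the two word lengths, so the total cost depends mainly on how many message words are checked against how many watch words.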

RE: [SAtalk] More obfuscation

2004-01-20 Thread Chris Santerre
I'm not sure where the post is, but about 3 weeks ago I think Dallas put a semi-end to the spell-checker debate :) He ran one and the outcome wasn't so good. --Chris > -Original Message- > From: Charles Gregory [mailto:[EMAIL PROTECTED] > Sent: Tuesday, January 20, 2004 4:37 PM > To: [EM

Re: [SAtalk] More obfuscation

2004-01-20 Thread Christopher X. Candreva
On Tue, 20 Jan 2004, Marcus Frischherz wrote: > But there is: there exists (at least in PHP) a function called > levenshtein, which calculates the similarity between two words. Surely > there must exist a perl equivalent to it. see: > http://at.php.net/manual/en/function.levenshtein.php I wonder

Re: [SAtalk] More obfuscation

2004-01-20 Thread Bob Apthorpe
Hi, On Tue, 20 Jan 2004, Marcus Frischherz wrote: > Charles Gregory wrote: > > >I'm starting to see mail with TEXT obfuscation, such as: > > I heard you need viagrPa. > >Note the capital P thrown in to our favorite 'v' word. > >It is really beginning to look like we need a genuine spelling chec

Re: [SAtalk] More obfuscation

2004-01-20 Thread Christopher X. Candreva
On Tue, 20 Jan 2004, Charles Gregory wrote: > > I'm starting to see mail with TEXT obfuscation, such as: >I heard you need viagrPa. > Note the capital P thrown in to our favorite 'v' word. I was just about to post another one I received, same deal: http://www.westnet.com/~chris/Spam0120

Re: [SAtalk] More obfuscation

2004-01-20 Thread Marcus Frischherz
Charles Gregory wrote: > I'm starting to see mail with TEXT obfuscation, such as: > I heard you need viagrPa. > Note the capital P thrown in to our favorite 'v' word. > It is really beginning to look like we need a genuine spelling checker, or some sort of 'approximation' technology, if such exists.