A thought on spammers' oft-used sets of 'random' character strings in emails... an example:
--
gnqplleqhzblll u wfjmvfe upvxoi lwhm xqs flckwrtsmufx irwajksqsnw er wcfjgfmk jugxfq
--

Seems to me that some tests can be made from these...

body GW_10_CONSONENTS /[bcdfghjklmnpqrstvwxz]{10}/
score GW_10_CONSONENTS 1.0

body GW_9_CONSONENTS /[bcdfghjklmnpqrstvwxz]{9}/
score GW_9_CONSONENTS 0.9

body GW_8_CONSONENTS /[bcdfghjklmnpqrstvwxz]{8}/
score GW_8_CONSONENTS 0.8

body GW_7_CONSONENTS /[bcdfghjklmnpqrstvwxz]{7}/
score GW_7_CONSONENTS 0.7

body GW_6_CONSONENTS /[bcdfghjklmnpqrstvwxz]{6}/
score GW_6_CONSONENTS 0.6

body GW_5_CONSONENTS /[bcdfghjklmnpqrstvwxz]{5}/
score GW_5_CONSONENTS 0.5

These have not been tested yet...

Some potential concerns:

- Encoded messages will likely set this off (uuencode, binhex, etc.)
- Are there many legitimate situations where 5+ consonants will be seen?
- Will other languages (such as German and Welsh, with long strings of
  consonants) be penalized by this?
- Can we determine any other sorts of patterns from spammers' use of these?

Any more thoughts?

Greg

-- 
Greg Webster - [EMAIL PROTECTED]
In-Touch Software Corporation
Ph: (604)278-0515 - Fax: (604)608-3112

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
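
Since the rules above are untested, here is a rough way to try the idea offline before putting it into a ruleset. This is only a sketch in Python, not SpamAssassin itself: it reuses the same consonant class (lowercase only, no 'y', as in the rules), and the helper names and the "ham" sentence are made up for illustration. It also shows a side effect worth keeping in mind: a pattern like {9} matches any run of 9 or more consonants, so a long run trips every shorter rule as well and the scores add up.

```python
import re

# Same character class as the proposed rules: lowercase only, 'y' excluded.
CONSONANTS = "bcdfghjklmnpqrstvwxz"
RUN_RE = re.compile(f"[{CONSONANTS}]+")


def longest_consonant_run(text: str) -> int:
    """Length of the longest unbroken run of consonants in text."""
    return max((len(run) for run in RUN_RE.findall(text)), default=0)


def rules_hit(text: str, lengths=range(5, 11)) -> list[str]:
    """Names of the GW_n_CONSONENTS rules whose pattern would match text.

    /[...]{n}/ matches whenever some run is at least n long, so every
    rule with n <= longest run fires.
    """
    longest = longest_consonant_run(text)
    return [f"GW_{n}_CONSONENTS" for n in lengths if longest >= n]


def stacked_score(text: str) -> float:
    """Sum of the proposed scores (n / 10 each) over all rules that fire."""
    longest = longest_consonant_run(text)
    return sum(n for n in range(5, 11) if longest >= n) / 10


# The spam example from the message, plus a hypothetical legitimate sentence.
spam_sample = ("gnqplleqhzblll u wfjmvfe upvxoi lwhm xqs flckwrtsmufx "
               "irwajksqsnw er wcfjgfmk jugxfq")
ham_sample = "Please forward the catchphrase list to the lengths committee."

if __name__ == "__main__":
    print(longest_consonant_run(spam_sample))  # 9 (from "flckwrtsmufx")
    print(stacked_score(spam_sample))          # 3.5
    print(rules_hit(ham_sample))               # the 5 and 6 rules fire
```

On the sample above, the longest run is 9, so five rules fire for a combined 3.5 points. The made-up legitimate sentence also trips the 5 and 6 rules ("catchphrase" contains "tchphr"), and a German word such as "Angstschweiss" contains an 8-consonant run, which suggests the shorter rules in particular would need checking against real ham before use.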