On Sat, 2 Feb 2013, Eliezer Croitoru wrote:
I wrote something in Ruby which actually works fine as a starter.
#code start
spam_content = "the long part from the mail".force_encoding("Windows-1255")
template_hebrew_chars = 270
def hebrew_char(char)
  # Byte value of one Windows-1255 character
  byte = char.unpack("H*")[0].hex
  # Byte ranges treated as Hebrew glyphs
  (192..203).member?(byte) || (205..219).member?(byte) || (223..251).member?(byte)
end
# Count the characters that fall in the Hebrew ranges
counter = spam_content.each_char.count { |char| hebrew_char(char) }
if counter == template_hebrew_chars
  puts "this is a spam"
else
  puts "might not be a spam"
end
##code end
Now *that* might be possible in plain SA rules without a plugin: count the
number of characters in the message body, and the number of characters
that fall in a given range (e.g. those that are Hebrew glyphs), and
calculate the percentage. I *think* you can do math in meta rules...
However, a plugin would be _much_ more efficient than something like:
body __HBRW_CHARS /[\xc0-\xcb\xcd-\xdb\xdf-\xfb]/
tflags __HBRW_CHARS multiple
body __TOTAL_CHARS /\S/
tflags __TOTAL_CHARS multiple
meta __HBRW_PCT ((__HBRW_CHARS * 100) / __TOTAL_CHARS)
meta HBRW_SPAM (__HBRW_PCT < 50) && __HBRW_ENCODING
I don't know whether the division in __HBRW_PCT or the less-than
comparison in HBRW_SPAM would work; that's totally off the top of my head
and untested. I also leave the __HBRW_ENCODING rule as an exercise for the
student. :)
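
For comparison, the percentage idea can be sketched in Ruby, in the same
spirit as the quoted snippet. This is only a rough illustration: the names
HEBREW_RANGES, WHITESPACE and hebrew_percentage are made up here, the byte
ranges are copied from __HBRW_CHARS, and whitespace bytes are skipped to
mirror /\S/. It assumes the text is in a single-byte encoding such as
Windows-1255 and says nothing about how SA itself evaluates meta rules.

```ruby
# Byte ranges copied from the __HBRW_CHARS rule (assumed to be Hebrew glyphs).
HEBREW_RANGES = [0xc0..0xcb, 0xcd..0xdb, 0xdf..0xfb].freeze
# Bytes skipped when counting, to mirror the /\S/ rule.
WHITESPACE = " \t\n\v\f\r".bytes.freeze

# Hypothetical helper: percentage of non-whitespace bytes in the Hebrew ranges.
def hebrew_percentage(text)
  non_space = text.bytes.reject { |b| WHITESPACE.include?(b) }
  return 0.0 if non_space.empty? # avoid dividing by zero on an empty body
  hebrew = non_space.count { |b| HEBREW_RANGES.any? { |r| r.cover?(b) } }
  hebrew * 100.0 / non_space.size
end

# Four bytes in the Hebrew ranges followed by three ASCII letters.
sample = "\xf9\xec\xe5\xed abc".b
puts hebrew_percentage(sample).round(1) # prints 57.1
```

The guard for an empty body matters for the rule version too: if
__TOTAL_CHARS never hits, the division has nothing to divide by.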
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79