Hello again,

On Thu, 30 Aug 2012, Maarten Broekman wrote:

> Some of the phishing content that I'm finding is resulting in hex
> dumps in the 10k+ character range and I think it's more dangerous to
> replace sections with '*' than to replace certain substrings with
> specific length wildcards.
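
To make that trade-off concrete, here is a rough sketch in Python regex terms. It assumes (and this is only an assumption on my part) that a bounded wildcard means "between n and m arbitrary bytes", while '*' means "any number of bytes". HEAD, TAIL and the sample buffers are made up for illustration; they are not taken from any real signature.

```python
import re

# Hypothetical fixed fragments standing in for the literal parts of a signature.
HEAD = b"login"
TAIL = b"passwd"

# Unbounded gap: the analogue of replacing a section with '*'.
unbounded = re.compile(re.escape(HEAD) + b".*" + re.escape(TAIL), re.DOTALL)

# Bounded gap: the analogue of a specific-length wildcard, here 7 to 8 bytes.
bounded = re.compile(re.escape(HEAD) + b".{7,8}" + re.escape(TAIL), re.DOTALL)

# Two made-up buffers: the fragments close together, and very far apart.
near = HEAD + b"x" * 7 + TAIL
far = HEAD + b"x" * 5000 + TAIL


def matches(pattern, data):
    """Return True if the pattern occurs anywhere in the data."""
    return pattern.search(data) is not None


for name, data in (("near", near), ("far", far)):
    print(name, "unbounded:", matches(unbounded, data),
          "bounded:", matches(bounded, data))
```

Under that assumption the unbounded form accepts both buffers, while the bounded form accepts only the one where the two fragments are 7 or 8 bytes apart, which is presumably why the bounded form is considered less likely to match unrelated content.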

This brings to mind a large proportion of our customers, who will happily send us a four megabyte PDF file to order a pack of CDs.

I think it calls for a complete re-think. It seems to me that if signatures are of that size there must be a great deal of redundancy in them, and it might well be indicative of a flaw in the process design. I imagine that removing redundancy effectively will not be a matter of tinkering with a few character strings, but of tackling the issue more directly, possibly mathematically.

Please would someone explain to me the use of "{7-8}"? I do not recognize it as valid regular expression syntax. According to the current ClamAV documentation (15 May 2012) repeat character counts are not supported:

http://www.clamav.net/doc/latest/phishsigs_howto.pdf

--

73,
Ged.