Duncan Findlay wrote:

DF> Clearly, we can not do this with EVERY combination, unless Craig has a
DF> lot of CPU to spare. There are just under 400 rules right now. If we
DF> ended up with 400 tests, there would be 79800 doubles and 10586800
DF> triplets.
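
For anyone who wants to check the quoted figures, here is a minimal sketch of the arithmetic. It assumes exactly 400 rules and counts unordered pairs and triples; the variable names are made up for illustration.

```python
import math

n_rules = 400  # assumed round figure from the quoted message

doubles = math.comb(n_rules, 2)   # unordered pairs of rules: 79800
triplets = math.comb(n_rules, 3)  # unordered triples of rules: 10586800

# Singles, doubles and triplets together, relative to the singles alone.
# This is the slowdown factor only if the GA really were O(n).
total_tests = n_rules + doubles + triplets
slowdown = math.ceil(total_tests / n_rules)  # 26668

print(doubles, triplets, slowdown)
```

The ratio comes out at 26667.5 before rounding up, which matches the 26668 figure for the linear-time case.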

We really don't care about *EVERY* combination, just the common ones. I
bet many of the combinations never appear together, and it might even be
impossible for some combinations to appear together (haven't checked).

DF> So, assuming the GA runs in O(n) time, (which is not at all likely to
DF> be true -- I'd guess O(n^2) if I had to), this would require 26668
DF> times longer to generate scores.

DF> Of course this total would be less but still quite significant if
DF> doubles and triples were added as they were seen, but still, I
DF> estimate this would be extremely taxing on CPU.

Shouldn't be too bad, I don't think. The GA is pretty efficient when
evaluating genomes. Wouldn't take very long to turn spam.log and
nonspam.log into a rule-combination-frequencies file either.

C
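
A rough sketch of the rule-combination-frequencies step, under loud assumptions: the real layout of spam.log and nonspam.log is not shown in this thread, so the code pretends each message has already been reduced to one line of comma-separated rule names. The function name `combination_frequencies` and the sample rule names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def combination_frequencies(lines, max_size=3):
    """Count how often each pair/triple of rules fires on the same message.

    ASSUMPTION: each line is a comma-separated list of the rule names hit
    by one message. This is a stand-in format, not the actual log layout.
    """
    counts = Counter()
    for line in lines:
        # De-duplicate and sort so each combination has one canonical key.
        rules = sorted({r.strip() for r in line.split(",") if r.strip()})
        for size in range(2, max_size + 1):
            counts.update(combinations(rules, size))
    return counts


# Hypothetical sample: three messages and the rules each one hit.
sample = ["A,B,C", "A,B", "C"]
freq = combination_frequencies(sample)
for combo, n in freq.most_common():
    print(",".join(combo), n)
```

Only combinations that actually occur get an entry, so the table grows with what is seen in the logs rather than with all 10586800 possible triplets. A frequency cutoff could then pick out "the common ones".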