On Wed, 26 Nov 2003 12:58:19 -0500, Kris Deugau <[EMAIL PROTECTED]> writes:
> Pedro Sam wrote: > > hehe, we can give a state to each possible combination of HITS for the > > rules. So if rules 1, 3, 5, 7 hit, we give that a state, and if 2, 4, > > 6, 8 hit, we give it another state, and so on... I think they call it > > subset construction or something... > > Which is next to useless for even 30-40 rules. You would have to have > n![1] states for n rules- there is NO way to determine which individual > rules will trigger in advance. In the worst case, precomputing the transition table for the DFA is exponential with only one regexp. In practice, most regexps people care about are much more well behaved, so the result seems to be about linear in the sums of the lengths of all of the regexps. Even for ill-formed rules, the table can also be constructed lazily at runtime. See the dragon book for details on both algorithms. To keep track of all matching regexps, I annotate the DFA states with not one rule for backtracking, but a list of all matching rules. Then log them during interpretation and generate a union of all of those. Scott ------------------------------------------------------- This SF.net email is sponsored by: SF.net Giveback Program. Does SourceForge.net help you be more productive? Does it help you create better code? SHARE THE LOVE, and help us help YOU! Click Here: http://sourceforge.net/donate/ _______________________________________________ Spamassassin-talk mailing list [EMAIL PROTECTED] https://lists.sourceforge.net/lists/listinfo/spamassassin-talk