On Wed, 08 Dec 2004 18:39:15 -0500, Lonnie Princehouse wrote:

> Regular expressions.
>
> It takes a while to craft the expressions, but this will be more
> elegant, more extensible, and considerably faster to compute (matching
> compiled re's is fast).

I'm already doing that with the rehmac regex. I like your idea for making it more readable, though.

Looking for permutations of the IP address gives much more bang per line of code than most host-only regexes, since it is ISP independent. At least one ISP uses roman numerals to code the IP for their dynamic addresses! I tried matching a custom regex computed from the IP, but compiling the regex for each test was too slow.

I could keep adding more patterns, but I was hoping for a tool that "learns" from a database of preclassified examples how to recognize the pattern. And the resulting data should be reasonably compact. I don't ask for much, do I?

A Bayesian classifier would have too big a database, I think. I've seen neural nets do amazing things with only 100 or so neurons, which means a small weight database. But they are slow in software.

I have posted 10K preclassified (by the current algorithm) examples here:

http://bmsi.com/python/dynip.samp
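
One way to look for permutations of the IP without compiling a regex per test is to use plain substring checks plus a pattern compiled once. A minimal Python 3 sketch follows. The function name embeds_ip, the three encodings it checks (separate decimal fields, zero-padded decimal, hex) and the example.net hostnames are assumptions for illustration, not the algorithm that classified the posted samples. Roman numerals are not handled.

```python
import re
from collections import Counter
from itertools import permutations

# Compiled once at import time and reused for every hostname tested.
DIGIT_RUNS = re.compile(r"\d+")


def embeds_ip(host, ip):
    """Return True if host appears to embed all four octets of ip.

    Hypothetical helper: checks three common encodings, in any octet order.
    """
    octets = [int(part) for part in ip.split(".")]
    host = host.lower()

    # Case 1: octets appear as separate decimal fields, in any order,
    # e.g. "adsl-69-152-10-5" or "5.10.152.69".
    fields = [int(run) for run in DIGIT_RUNS.findall(host) if len(run) <= 3]
    if not (Counter(octets) - Counter(fields)):
        return True

    # Cases 2 and 3: octets concatenated without separators, either
    # zero-padded decimal ("069152010005") or hex ("45980a05").
    # Only 24 orderings exist, so plain substring tests stay cheap.
    for order in permutations(octets):
        padded = "".join("%03d" % o for o in order)
        hexed = "".join("%02x" % o for o in order)
        if padded in host or hexed in host:
            return True
    return False
```

Because the only regex is fixed, nothing is compiled per test; the per-IP work is string formatting and substring search. Case 1 is deliberately loose (it only requires that each octet shows up somewhere as a number), so it would need tightening before being trusted on real data.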