Re: Help beautify ugly heuristic code

2006-07-24 Thread Stuart D. Gathman
On Thu, 09 Dec 2004 00:01:36 -0800, Lonnie Princehouse wrote: > I believe you can still do this with only compiling a regex once and > then performing a few substitutions on the hostname. That is an interesting idea. Convert ip matches to fixed patterns, and *then* match the regex. I think I wo

Re: Help beautify ugly heuristic code

2004-12-10 Thread Stuart D. Gathman
On Fri, 10 Dec 2004 22:03:20 +, JanC wrote: > Stuart D. Gathman schreef: > >> I have a function that recognizes PTR records for dynamic IPs. There > Did you also think about ISPs that use such a PTR record for both dynamic > and fixed IPs? There seems to be a lot of misunderstanding about

Re: Help beautify ugly heuristic code

2004-12-10 Thread JanC
Stuart D. Gathman schreef: > I have a function that recognizes PTR records for dynamic IPs. There > is no hard and fast rule for this - every ISP does it differently, and > may change their policy at any time, and use different conventions in > different places. Nevertheless, it is useful to app

Re: Help beautify ugly heuristic code

2004-12-10 Thread Stuart D. Gathman
On Thu, 09 Dec 2004 00:01:36 -0800, Lonnie Princehouse wrote: > I believe you can still do this with only compiling a regex once and > then performing a few substitutions on the hostname. Cool idea. Convert ip matches to fixed patterns before matching a fixed regex. The leftovers like shaw cable

Re: Help beautify ugly heuristic code

2004-12-09 Thread Jeremy Sanders
On Wed, 08 Dec 2004 18:38:14 -0500, Stuart D. Gathman wrote: >> Here are the last 20 (which my subjective judgement says are correct): > > 65.112.76.15 usfshlxmx01.myreg.net 201.128.108.41 [snip] > 80.143.79.97 p508F4F61.dip0.t-ipconnect.de DYN Looks like you could do something like look

Re: Help beautify ugly heuristic code

2004-12-09 Thread Mitja
On Wed, 08 Dec 2004 16:09:43 -0500, Stuart D. Gathman <[EMAIL PROTECTED]> wrote: I have a function that recognizes PTR records for dynamic IPs. There is no hard and fast rule for this - every ISP does it differently, and may change their policy at any time, and use different conventions in diff

Re: Help beautify ugly heuristic code

2004-12-09 Thread Lonnie Princehouse
Doh! I misread "a" as host instead of ip in your first post. I'm sorry about that; I really must slow down. Anyhow, I believe you can still do this with only compiling a regex once and then performing a few substitutions on the hostname. Substitutions: 1st byte of IP => (0) 2nd byte of IP =>
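
The substitution idea in this post can be sketched roughly as follows. This is only an illustration, not the code from the thread: the snippet above is cut off after the first token, so the "(1)", "(2)", "(3)" tokens, the names normalize and looks_dynamic, and the pattern in TOKEN_PATTERN are all assumptions made up for the example. It only handles decimal octets; hex-encoded names such as p508F4F61 would need separate handling.

```python
import re

# Compiled once. Written against the placeholder tokens, not real numbers.
# Hypothetical pattern: all four octets in reverse or forward order,
# separated by '-' or '.', followed by a dot.
TOKEN_PATTERN = re.compile(
    r"\(3\)[-.]\(2\)[-.]\(1\)[-.]\(0\)\."
    r"|\(0\)[-.]\(1\)[-.]\(2\)[-.]\(3\)\."
)

def normalize(host, ip):
    """Replace each number in host that equals an octet of ip with a token."""
    octets = [int(part) for part in ip.split(".")]

    def repl(match):
        value = int(match.group())
        if value in octets:
            # Repeated octet values all map to the first matching position.
            return "(%d)" % octets.index(value)
        return match.group()

    # Single pass over the digit runs, so inserted tokens are never rescanned.
    return re.sub(r"\d+", repl, host)

def looks_dynamic(host, ip):
    """True if the normalized hostname matches the fixed pattern."""
    return TOKEN_PATTERN.search(normalize(host, ip)) is not None
```

For example, normalize("80-143-79-97.example.net", "80.143.79.97") gives "(0)-(1)-(2)-(3).example.net", which the fixed pattern then matches without being recompiled per IP.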

Re: Help beautify ugly heuristic code

2004-12-08 Thread Stuart D. Gathman
On Wed, 08 Dec 2004 19:52:53 -0500, Lonnie Princehouse wrote: > I don't think a Bayesian classifier is going to be very helpful here, > unless you have tens of thousands of examples to feed it, or unless it We do have tens of thousands of examples to feed it. > The series of if host.find(...) li

Re: Help beautify ugly heuristic code

2004-12-08 Thread Lonnie Princehouse
I don't think a Bayesian classifier is going to be very helpful here, unless you have tens of thousands of examples to feed it, or unless it was specially coded to first break addresses into better tokens for classification (such as alphanumeric strings and numbers). The series of if host.find(...
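
A minimal sketch of the kind of tokenizing mentioned here, assuming "better tokens" means splitting a hostname into runs of letters and runs of digits. The function name tokenize and the choice to collapse every number to a single "<num>" token are assumptions for illustration, not something stated in the thread.

```python
import re

# Runs of letters or runs of digits; dots and dashes are dropped.
TOKEN_RE = re.compile(r"[a-z]+|[0-9]+")

def tokenize(host):
    """Split a hostname into tokens a classifier could count."""
    tokens = []
    for tok in TOKEN_RE.findall(host.lower()):
        # Assumption: the exact number rarely matters, only that one is there.
        tokens.append("<num>" if tok.isdigit() else tok)
    return tokens
```

With this, "dsl-12-34.example.net" becomes ['dsl', '<num>', '<num>', 'example', 'net'], so hosts from the same naming scheme share most of their tokens.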

Re: Help beautify ugly heuristic code

2004-12-08 Thread Stuart D. Gathman
On Wed, 08 Dec 2004 18:39:15 -0500, Lonnie Princehouse wrote: > Regular expressions. > > It takes a while to craft the expressions, but this will be more > elegant, more extensible, and considerably faster to compute (matching > compiled re's is fast). I'm already doing that with the rehmac rege

Re: Help beautify ugly heuristic code

2004-12-08 Thread Carlos Ribeiro
On 8 Dec 2004 15:39:15 -0800, Lonnie Princehouse <[EMAIL PROTECTED]> wrote: > Regular expressions. > > It takes a while to craft the expressions, but this will be more > elegant, more extensible, and considerably faster to compute (matching > compiled re's is fast). I think that this problem is p

Re: Help beautify ugly heuristic code

2004-12-08 Thread Lonnie Princehouse
Regular expressions. It takes a while to craft the expressions, but this will be more elegant, more extensible, and considerably faster to compute (matching compiled re's is fast). Example using the top five from your function's comments: . host_patterns = [ . '^1Cust\d+\.tnt\d+\..*\.da\.uu\
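
Since the example in this snippet is cut off, here is a rough, self-contained sketch of the same approach: a list of patterns joined into one regex that is compiled once. The patterns below are not the ones from the post. The first is guessed from the p508F4F61.dip0.t-ipconnect.de example elsewhere in the thread, the second is entirely made up, and the name is_dynamic is hypothetical.

```python
import re

# Illustrative patterns only; a real list would come from observed PTR names.
HOST_PATTERNS = [
    r"^p[0-9A-F]{8}\.dip\d*\.t-ipconnect\.de$",   # guessed from one example
    r"^dsl-\d+-\d+\.example\.net$",               # made-up placeholder
]

# Join into a single alternation so matching needs one compiled regex.
DYNAMIC_RE = re.compile(
    "|".join("(?:%s)" % p for p in HOST_PATTERNS),
    re.IGNORECASE,
)

def is_dynamic(host):
    """True if the PTR name matches any of the known dynamic patterns."""
    return DYNAMIC_RE.search(host) is not None
```

Adding a new ISP convention then means adding one string to HOST_PATTERNS rather than another if host.find(...) branch.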

Re: Help beautify ugly heuristic code

2004-12-08 Thread Stuart D. Gathman
On Wed, 08 Dec 2004 18:00:06 -0500, Mitja wrote: > On Wed, 08 Dec 2004 16:09:43 -0500, Stuart D. Gathman <[EMAIL PROTECTED]> > wrote: > >> I have a function that recognizes PTR records for dynamic IPs Here >> is the very ugly code so far. >> ... >> # examples we don't yet recognize: >> ... >

Re: Help beautify ugly heuristic code

2004-12-08 Thread Mitja
On Wed, 08 Dec 2004 16:09:43 -0500, Stuart D. Gathman <[EMAIL PROTECTED]> wrote: I have a function that recognizes PTR records for dynamic IPs Here is the very ugly code so far. ... # examples we don't yet recognize: ... This doesn't help much; post example of all the possible patterns you h