I have a function that recognizes PTR records for dynamic IPs. There is
no hard-and-fast rule for this: every ISP does it differently, may
change its policy at any time, and may use different conventions in
different places. Nevertheless, it is useful to apply stricter
authentication standards to incoming email when the PTR for the IP
indicates a dynamic IP (that is, the PTR record is ignored, since it
doesn't mean anything except to the ISP). This is because Windoze
Zombies are the favorite platform of spammers.
This is roughly it. You'll have to experiment to find the right numbers for the different pattern matches, and maybe add some extra criteria. I don't have the time for that right now, but I'd be interested to know how much my code and yours differ in detection (i.e. where the return values differ).
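One way to see where two detectors disagree is to run both over the same samples and keep only the mismatches. This is a minimal sketch: `disagreements`, the two stand-in detectors, and the sample data are all hypothetical names and made-up values for illustration, not either poster's actual code or data.

```python
def disagreements(samples, detect_a, detect_b):
    """Return (host, ip, a, b) for every sample where the detectors differ."""
    diffs = []
    for host, ip in samples:
        a = bool(detect_a(host, ip))
        b = bool(detect_b(host, ip))
        if a != b:
            diffs.append((host, ip, a, b))
    return diffs


# Hypothetical stand-ins for the two detectors being compared.
def detect_by_keyword(host, ip):
    host = host.lower()
    return 'dsl' in host or 'dyn' in host


def detect_by_last_octet(host, ip):
    return ip.split('.')[-1] in host


# Made-up samples using documentation address ranges, not real measurements.
samples = [
    ('adsl-7.example.net', '192.0.2.7'),
    ('mail.example.org', '198.51.100.25'),
    ('host42.example.com', '203.0.113.42'),
]

for host, ip, a, b in disagreements(samples, detect_by_keyword,
                                    detect_by_last_octet):
    print('%s %s: first=%s second=%s' % (host, ip, a, b))
```

With these samples only the third host is reported, because the keyword check misses it while the last-octet check matches. Swapping in the real detectors and a real sample file would show which hostname patterns each one handles differently.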
Hope the indentation makes it through alright.
#!/usr/bin/python
import re

reNum = re.compile(r'\d+')
reWord = re.compile(r'(?<=[^a-z])[a-z]+(?=[^a-z])|^[a-z]+(?=[^a-z])')

# words that imply a dynamic ip
dynWords = ('dial', 'dialup', 'dialin', 'adsl', 'dsl', 'dyn', 'dynamic')
# words that imply a static ip
staticWords = ('cable', 'static')


def isDynamic(host, ip):
    """
    Heuristically checks whether hostname is likely to represent
    a dynamic ip. Returns True or False.
    """
    # for easier matching
    ip = [int(p) for p in ip.split('.')]
    host = host.lower()

    # since it's heuristic, we'll give the hostname
    # (de)merits for every pattern it matches further on.
    # based on the value of these points, we'll decide whether
    # it's dynamic or not
    points = 0
    # the ip numbers; finding those in the hostname speaks
    # for itself; also include hex and oct representations
    # lowest ip byte is even more suggestive, give extra points
    # for matching that
    for p in ip[:3]:
        # bytes 0, 1, 2 ('%o' gives the octal digits without a prefix)
        if (host.find(str(p)) != -1) or (host.find('%o' % p) != -1):
            points += 20
    # byte 3
    if (host.find(str(ip[3])) != -1) or (host.find('%o' % ip[3]) != -1):
        points += 60
    # it's hard to distinguish hex numbers from "normal"
    # chars, so we simplify it a bit and only search for
    # last two bytes of ip concatenated (bytes 2 and 3, unpadded hex)
    if host.find('%x%x' % (ip[2], ip[3])) != -1:
        points += 60
    # long, seemingly random serial numbers in the hostname are also a hint
    # search for all numbers and "award" points for longer ones
    for num in reNum.findall(host):
        points += min(len(num) ** 2, 60)

    # substrings that are more than just a hint of a dynamic ip
    for word in reWord.findall(host):
        if word in dynWords:
            points += 30
        if word in staticWords:
            points -= 30

    # debug output: the raw score
    print('[[ %d ]]' % points)
    return points > 80
if __name__ == '__main__':
    for line in open('dynip.samp').readlines()[:50]:
        (ip, host) = line.rstrip('DYN').split()[:2]
        if host.find('.') != -1:
            print('%s %s %s' % (host, ip, ['', 'DYNAMIC'][isDynamic(host, ip)]))
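To get a feel for what the two patterns pick out before tuning the numbers, it can help to run them on a single hostname. The sketch below reuses the script's patterns, word lists and point values; the hostname is made up for illustration, and the variable names `words`, `numbers`, `number_points` and `word_points` are not part of the script.

```python
import re

# Same patterns and word lists as in the script.
reNum = re.compile(r'\d+')
reWord = re.compile(r'(?<=[^a-z])[a-z]+(?=[^a-z])|^[a-z]+(?=[^a-z])')
dynWords = ('dial', 'dialup', 'dialin', 'adsl', 'dsl', 'dyn', 'dynamic')
staticWords = ('cable', 'static')

# Made-up hostname, used only for illustration.
host = 'dsl-203-0-113-42.dyn.example.net'

# Words bounded by non-letters, and every run of digits.
words = reWord.findall(host)
numbers = reNum.findall(host)

# Points from the "serial number" rule: length squared, capped at 60.
number_points = [min(len(n) ** 2, 60) for n in numbers]

# Points from the word rule: +30 per dynamic word, -30 per static word.
word_points = (sum(30 for w in words if w in dynWords)
               - sum(30 for w in words if w in staticWords))

print(words)
print(numbers)
print(number_points)
print(word_points)
```

For this hostname the word pattern yields dsl, dyn and example. The final label (net) is not matched, because the pattern requires a non-letter after each word and nothing follows the last one. Short digit runs contribute little (9, 1, 9 and 4 points here), so for a hostname like this most of the score comes from the keyword and IP-byte rules.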
--
Mitja
--
http://mail.python.org/mailman/listinfo/python-list