John, Below are ansewers to your questions.
-----Original Message----- From: John W. Krahn [mailto:[EMAIL PROTECTED] Sent: Monday, May 07, 2007 12:57 PM To: Perl Beginners Subject: Re: array and hash patter matching Tim Wolak wrote: > Morning all, Hello, > I am working on a script that reads in /var/log/auth.log,, takes the ip > addresses puts them into a hash keeping track of how many times it finds > that address and compare it to addresses found in /etc/hosts.deny and > only write the addresses that are new in the file. So far I can get the > addresses from the log file no problem and write them to the deny file, > however I am struggling on how to compare the hash with an array for any > duplicate addresses. What is the best approach to take with this? [ Code reformatted to reflect the actual structure. ] use warnings; use strict; > open (LOGFILE, "/var/log/auth.log") or die "Can't open log file : $!\n"; > open (DENY, "/etc/hosts.deny") or die "Can't open log file: $!\n"; > > while (<DENY>) { > if ($_ =~ /Invalid user/ || /Failed password for/) { [Tim Wolak] Valid point understand. Why use "$_ =~" in front of the first match and not in front of the second match? Either use it for both or use it for neither (be consistent.) The file /etc/hosts.deny doesn't even contain those strings does it? man 5 hosts_access > push @origDeny, $_; > } > foreach $orig (@origDeny) { [Tim Wolak] I'm using the loop to capture and step through each line of the /var/log/auth.log file. Why are you using this foreach loop inside the while loop? If the file contains five IP addresses then the first one will be pushed onto @hosts 5 times and the second one 4 times and the third one 3 times, etc. > if ($off =~ /((\d+)\.(\d+)\.(\d+)\.(\d+))/) { [Tim Wolak] See above.. Why are you capturing five different strings when you are only using one? > push @hosts, $1; > } > } > } The two arrays you just populated are not used again after the while loop ends so what was the point? > close DENY; > while (<LOGFILE>) { > if ($_ =~ /Invalid user/ || /Failed password for/) { > push @offenders, $_; > } > } > foreach $off (@offenders) { > if ($off =~ /((\d+)\.(\d+)\.(\d+)\.(\d+))/) { > push @list, $1; > } > } > foreach $number (@list) { > if (exists $iplist{$number}) { > $iplist{$number} ++; > } else { > $iplist{$number} = "1"; > } > } [Tim Wolak] This is the kind of advice I was looking for. Why use three loops to do something that you only need one loop for? my %iplist; while ( <LOGFILE> ) { if ( /Invalid user|Failed password for/ && /(\d+\.\d+\.\d+\.\d+)/ ) { $iplist{ $1 }++; } } > open (DENY, ">>/etc/hosts.deny") or die "Can't open log file: $!\n"; > foreach $key (keys %iplist) { > if ($iplist{$key} > 5) { [Tim Wolak] Five because that is the limit for how many times the IP is appearing in the auth.log file and then the "attacker" will be put in the hosts.deny file. Why 5? > foreach $tim (@list) { > if ($tim !~ /$iplist{$key}/) { [Tim Wolak] I'm matching the so when the authentication attempts are multiple attempts that are logged I do not put 100 of the same IP addresses in the deny file. i.e. if its in the deny file already don't put it in there again.... Why are you trying to match the number in $iplist{$key} to the IP address in $tim? > print DENY "$key\n"; According to hosts_access(5) the /etc/hosts.deny file needs more on the line than just the IP address. man 5 hosts_access [ snip ] ACCESS CONTROL RULES Each access control file consists of zero or more lines of text. These lines are processed in order of appearance. The search terminates when a match is found. · A newline character is ignored when it is preceded by a backslash character. This permits you to break up long lines so that they are easier to edit. · Blank lines or lines that begin with a `#´ character are ignored. This permits you to insert comments and whitespace so that the tables are easier to read. · All other lines should satisfy the following format, things between [] being optional: daemon_list : client_list [ : shell_command ] daemon_list is a list of one or more daemon process names (argv[0] values) or wildcards (see below). client_list is a list of one or more host names, host addresses, patterns or wildcards (see below) that will be matched against the client host name or address. The more complex forms [EMAIL PROTECTED] and [EMAIL PROTECTED] are explained in the sections on server endpoint patterns and on client username lookups, respectively. List elements should be separated by blanks and/or commas. With the exception of NIS (YP) netgroup lookups, all access control checks are case insensitive. > } > } > } > > } > close LOGFILE; > close DENY; John -- Perl isn't a toolbox, but a small machine shop where you can special-order certain sorts of tools at low cost and in short order. -- Larry Wall -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/