Thanks for the help...it explains why the syntax was (as far as I could tell) OK yet it still didn't work properly. I should have diffed the output of my test script rather than relying on a visual grep late at night :-).
On 26 May 1997, Roderick Schertler wrote:

> On Mon, 26 May 1997 23:40:35 +1000 (EST), Craig Sanders <[EMAIL PROTECTED]>
> said:
> >
> > The database is a .db file created with 'makemap hash redir <redir'
> > from [:space:]-delimited source input like the following:
> [...]
> > //.*riddler.com/Commonwealth/bin/statdeploy.* //www.taz.net.au/blank_ad.gif
>
> The problem is that makemap downcases the keys by default so
> Commonwealth is commonwealth in the map.  Use the -f flag when building
> the map to disable this behavior.

The Answer! Thanks!

> Since you're always scanning the db linearly, though, using a DB map
> isn't buying you anything.  I'd just read the patterns from the text
> file directly.

I'm using the hashed db for speed and convenience. One advantage of the db file is that comments are stripped out by makemap, which means i can have as many comments as i like in the source text but it won't slow down the script at all....quite important when on some of my squid boxes this script has to do 50000+ lookups per hour.

Also, unless the tie function (or similar) can work with text files as well as db files, there will also be the overhead of opening & closing the file for every URL, plus the overhead of parsing each line into its two fields...

e.g. with one hundred entries in the file on a moderately busy machine like the one above, that would be 50,000 open & close operations per hour plus up to 5,000,000 line parsing operations per hour (50,000 URLs x 100 lines each). most URLs scanned WON'T match any of the patterns, so the loop will have to run to completion. very rough calculations(*) from my squid log files indicate that around 10% of URLs are banner advertisements.

I could just read the text file into an array but that would mean i was back where i started - having to restart squid when i make a change to the database. alternatively i could modify the script to respond to SIGHUP by re-reading the text file.
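A rough sketch of that SIGHUP idea, not the actual script: the rules are read once into an array, and a HUP signal only sets a flag so the file is re-read before the next lookup. The sub names (load_rules, redirect, serve) and the assumption that each input line is just a bare URL are made up for illustration; a real squid redirector gets extra fields after the URL, which this ignores.

```perl
#!/usr/bin/perl -w
use strict;

my @rules;        # list of [compiled pattern, replacement]
my $reload = 1;   # force a load before the first lookup

# Only set a flag in the handler; the real work happens in the main loop.
$SIG{HUP} = sub { $reload = 1 };

# Parse whitespace-delimited "pattern replacement" lines, skipping blank
# lines and '#' comments, so comments cost nothing per lookup.
sub load_rules {
    my ($file) = @_;
    my @new;
    open(my $fh, '<', $file) or return;    # keep the old rules on failure
    while (my $line = <$fh>) {
        next if $line =~ /^\s*(#|$)/;
        my ($pattern, $target) = split ' ', $line;
        next unless defined $target;
        my $re = eval { qr/$pattern/ } or next;    # skip invalid patterns
        push @new, [$re, $target];
    }
    close $fh;
    @rules = @new;
    return scalar @rules;
}

# Rewrite the URL with the first matching rule, then stop.
sub redirect {
    my ($url) = @_;
    for my $rule (@rules) {
        my ($re, $target) = @$rule;
        last if $url =~ s/$re/$target/;
    }
    return $url;
}

# Read one URL per line from $in and write the (possibly rewritten) URL
# to $out, re-reading the rules file whenever a HUP has been received.
sub serve {
    my ($file, $in, $out) = @_;
    while (my $url = <$in>) {
        chomp $url;
        if ($reload) {
            $reload = 0;
            load_rules($file);
        }
        print {$out} redirect($url), "\n";
    }
}
```

To run it as a redirector one would set `$| = 1;` and call `serve('redir', \*STDIN, \*STDOUT);` ('redir' being the source text file fed to makemap above). The file is then opened once per HUP rather than once per URL, and the patterns are compiled once per load instead of on every lookup.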

(*) 'wc -l access.log' vs 'grep blank_ad.gif access.log | wc -l'

about 10% of the entries in the access.log over the last month were advertising banners redirected to blank_ad.gif by my script. this is on my lightly-used squid box at home where i do most of my web browsing in non-commercial linux & 'weirdness' related areas. I don't block advertising on my big squid at work, but I would guess that the proportion would be much higher.

To tell the truth, I didn't mind banner ads until they started using FLASHING animated gifs - whoever invented gif animations should be drawn and quartered very slowly over a hot fire.

> Gratuitous unsolicited style tip #1: Don't put semicolons after a
> closing brace except for do and eval blocks, and sub ref constructors.

a bad habit, i know. it's easier to just put them in after every } rather than have to remember the exceptions where they're required.

> Gratuitous unsolicited style tip #2: This code would more idiomatically
> be
>
>     print "$url==>" if $debug;
>     while (($key, $record) = each %redir_db) {
>         if ($url =~ s/$key/$record/) {
>             print $url;
>             last;
>         }
>     }
>     print "\n";

yes, that's much better. thanks. i knew there was a way of dropping out of the loop quickly without using an ugly $found variable but couldn't remember what it was.

craig

--
craig sanders
networking consultant          Available for casual or contract
temporary autonomous zone      system administration tasks.