Sorry to continue this off-topic thread, but Volker Kuhlmann pointed out a bug and some possible improvements in the script I sent out, so I'm sending a new version for the record (i.e., for people searching the list archives with Google).

> :0
> md5sum=| perl -e 'while (($_ = <>) && !/^\r?\n$/) { if (/^[ \t]/) { next if $r; } else { if (/^(date|from|subject|to|cc)\s*:/i) { $r = 0; } else { $r = 1; next } } s/\r?\n$/\n/; print; } print "\n"; while (<>) { s/\r?\n$/\n/; print; }' | md5sum | sed -e 's/ .*//'

- You could probably save CPU cycles by using formail instead of perl
  for this, but I haven't tried that.

> :0:$md5cache.lock
> * !? ! fgrep -q "$md5sum" "$md5cache" && echo "$md5sum" >> "$md5cache"
> dupes

- Use "$LOCKEXT" instead of ".lock".
- Unfortunately, this local lock only protects delivery, not the test
  itself. This is a bug.
- You can save CPU cycles by putting the fgrep and echo commands in
  separate procmail tests so procmail can run the commands directly,
  without a shell.

Here's a new version of the script. I hope this is correct now!

SHELL=/bin/sh
MAILDIR=.../Mail
LOGFILE=$MAILDIR/Log

md5cache=$MAILDIR/MD5

:0
md5sum=| perl -e 'while (($_ = <>) && !/^\r?\n$/) { if (/^[ \t]/) { next if $r; } else { if (/^(date|from|subject|to|cc)\s*:/i) { $r = 0; } else { $r = 1; next } } s/\r?\n$/\n/; print; } print "\n"; while (<>) { s/\r?\n$/\n/; print; }' | md5sum | sed -e 's/ .*//'

LOCKFILE=$md5cache$LOCKEXT

:0:
* ? fgrep -q "$md5sum" "$md5cache"
dupes

:0ic
| echo "$md5sum" >> "$md5cache"

LOCKFILE

... etc ...