Nicolas Rachinsky <[EMAIL PROTECTED]>:

> > I think getmail does better than fetchmail if the network goes down
> > while it's polling the server: fetchmail has an annoying habit of
> > losing its fetchids when this happens, resulting in the delivery of
> > several hundred duplicate messages in my case. However, I don't care
> > about that so much any more as I have a procmail script to filter out
> > duplicates (using an MD5 sum of the message contents rather than
> > trusting Message-IDs).
> 
> Would you share it? I would like such a thing.

The start of my .procmailrc is below. The in-line Perl script removes
all header fields except Date, From, Subject, To and Cc, and the MD5 sum
is taken over what remains plus the body. The sums are appended to
$MAILDIR/MD5, which grows without limit, so you should remove the
beginning of that file from time to time. Duplicates are sent to =dupes.
It's not efficient or elegant, but it seems to work.
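
One way to do the trimming, as a rough sketch only: the function name
trim_md5_cache and the default of 5000 lines are made up for
illustration, and it takes no lock, so either run it when no mail is
being delivered or hold the same $md5cache.lock file while it runs.

```shell
#!/bin/sh
# trim_md5_cache FILE [KEEP]: keep only the last KEEP lines of FILE.
# KEEP defaults to 5000, an arbitrary figure chosen for illustration.
trim_md5_cache() {
    cache=$1
    keep=${2:-5000}
    [ -f "$cache" ] || return 0          # nothing to trim yet
    tmp="$cache.tmp.$$"
    # Write the newest entries to a temporary file, then replace the
    # cache in one step so a failed tail never leaves it truncated.
    tail -n "$keep" "$cache" > "$tmp" && mv "$tmp" "$cache"
}

# Example: trim_md5_cache "$HOME/Mail/MD5" 5000
```

Keeping the tail rather than deleting the file outright means recently
seen messages are still recognised as duplicates after a trim.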

Edmund


SHELL=/bin/sh

MAILDIR=$HOME/Mail
LOGFILE=$MAILDIR/Log

md5cache=$MAILDIR/MD5

:0
md5sum=| perl -e 'while (($_ = <>) && !/^\r?\n$/) { if (/^[ \t]/) { next if $r; } else { if (/^(date|from|subject|to|cc)\s*:/i) { $r = 0; } else { $r = 1; next } } s/\r?\n$/\n/; print; } print "\n"; while (<>) { s/\r?\n$/\n/; print; }' | md5sum | sed -e 's/ .*//'

:0:$md5cache.lock
* !? ! fgrep -q "$md5sum" "$md5cache" && echo "$md5sum" >> "$md5cache"
dupes
