Nicolas Rachinsky <[EMAIL PROTECTED]>:

> > I think getmail does better than fetchmail if the network goes down
> > while it's polling the server: fetchmail has an annoying habit of
> > losing its fetchids when this happens, resulting in the delivery of
> > several hundred duplicate messages in my case. However, I don't care
> > about that so much any more as I have a procmail script to filter out
> > duplicates (using an MD5 sum of the message contents rather than
> > trusting Message-IDs).
>
> Would you share it? I would like such a thing.

The start of my .procmailrc is below. The in-line Perl script removes
all header fields except Date, From, Subject, To and Cc. MD5 sums are
appended to $MAILDIR/MD5, so you should remove the beginning of that
file from time to time. Duplicates are sent to =dupes.

It's not efficient or elegant, but it seems to work.

Edmund

SHELL=/bin/sh
MAILDIR=$HOME/Mail
LOGFILE=$MAILDIR/Log
md5cache=$MAILDIR/MD5

:0
md5sum=| perl -e 'while (($_ = <>) && !/^\r?\n$/) { if (/^[ \t]/) { next if $r; } else { if (/^(date|from|subject|to|cc)\s*:/i) { $r = 0; } else { $r = 1; next } } s/\r?\n$/\n/; print; } print "\n"; while (<>) { s/\r?\n$/\n/; print; }' | md5sum | sed -e 's/ .*//'

:0:$md5cache.lock
* !? ! fgrep -q "$md5sum" "$md5cache" && echo "$md5sum" >> "$md5cache"
dupes
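
Since the MD5 file only ever grows, the "remove the beginning of that file from time to time" step could be scripted. What follows is only a rough sketch and not part of the .procmailrc above: the function name trim_md5_cache and the default of 5000 retained sums are assumptions chosen for illustration. It relies on new sums being appended, so the oldest entries are at the top of the file.

```shell
#!/bin/sh
# Hypothetical helper (not from the original .procmailrc): keep only the
# newest N lines of the MD5 cache. The function name and the default of
# 5000 entries are assumptions made for this sketch.
trim_md5_cache() {
    cache=$1
    keep=${2:-5000}

    # Nothing to do if the cache has not been created yet.
    [ -f "$cache" ] || return 0

    tmp="$cache.tmp.$$"
    # New sums are appended, so the newest entries are the last lines.
    tail -n "$keep" "$cache" > "$tmp" && mv "$tmp" "$cache"
}

# Example call, e.g. from a cron job:
#   trim_md5_cache "$HOME/Mail/MD5" 5000
```

Note that procmail could append a sum between the tail and the mv, and that sum would then be lost. Holding the same lock file the recipe uses ($MAILDIR/MD5.lock), for instance with the lockfile utility that ships with procmail, should avoid this; the sketch leaves that out for brevity.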