> I am futzing around with Andrew Stuart's "Catchmail" program
> that stores emails into a PostgreSQL database.
>
> I want to avoid inserting the same email more than once...
> (pieces of the email actually get emplaced into several
> tables).
>
> Is the "Message-ID" header field a globally unique identifier?

I think you're looking for RFC 2822 (http://www.faqs.org/rfcs/rfc2822.html). I seem to recall that one of the RFCs listed a time limit of two years for uniqueness, though I'm at a loss to find which one at the moment.

Pertinent sections:

3.6.4. Identification fields

Though optional, every message SHOULD have a "Message-ID:" field.

<snip>

The message identifier (msg-id) itself MUST be a globally unique identifier for a message. The generator of the message identifier MUST guarantee that the msg-id is unique. There are several algorithms that can be used to accomplish this. Since the msg-id has a similar syntax to angle-addr (identical except that comments and folding white space are not allowed), a good method is to put the domain name (or a domain literal IP address) of the host on which the message identifier was created on the right hand side of the "@", and put a combination of the current absolute date and time along with some other currently unique (perhaps sequential) identifier available on the system (for example, a process id number) on the left hand side. Using a date on the left hand side and a domain name or domain literal on the right hand side makes it possible to guarantee uniqueness since no two hosts use the same domain name or IP address at the same time. Though other algorithms will work, it is RECOMMENDED that the right hand side contain some domain identifier (either of the host itself or otherwise) such that the generator of the message identifier can guarantee the uniqueness of the left hand side within the scope of that domain.

Regards,
Bruce Ritchie
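
For illustration, the scheme the RFC recommends (date and time plus a process id and a sequence number on the left hand side, a domain on the right) might look roughly like the Python sketch below. The function name make_message_id, the exact layout of the left hand side, and the use of socket.getfqdn() as the default domain are assumptions made for this example, not anything the RFC prescribes.

```python
import itertools
import os
import socket
import time

# Process-wide counter so that two ids built within the same second still differ.
_counter = itertools.count()


def make_message_id(domain=None):
    """Build a msg-id shaped like <date.pid.seq@domain> (illustrative only)."""
    if domain is None:
        # Assumption: the local host name is a domain this host really owns.
        domain = socket.getfqdn()
    # Current absolute date and time, in UTC, as the RFC suggests.
    timestamp = time.strftime("%Y%m%d%H%M%S", time.gmtime())
    # Process id plus a sequential number as the "other currently unique" part.
    left = "%s.%d.%d" % (timestamp, os.getpid(), next(_counter))
    return "<%s@%s>" % (left, domain)
```

Calling make_message_id("mail.example.org") yields a string of the form <YYYYMMDDhhmmss.pid.seq@mail.example.org>. For real use, Python's standard library already provides email.utils.make_msgid. Note also that the header is only a SHOULD, so a message may arrive without one; a program that deduplicates on Message-ID probably needs a fallback for that case.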