Re: Best Practice: emails and file-attachments

2006-08-16 Thread John Haxby
Oh rats. Thunderbird ate the indenting. The two examples should be: multipart/alternative text/plain multipart/related text/html image/gif image/gif application/msword and multipart/related text/html image/

Re: Best Practice: emails and file-attachments

2006-08-16 Thread John Haxby
lude wrote: You also mentioned indexing each bodypart ("attachment") separately. Why? To my mind, there is no use case where it makes sense to search a particular bodypart. I will give you the use case: [snip] 3.) The result list would show this: 1. mail-1 'subject' 'Abstract of the messa

Re: Best Practice: emails and file-attachments

2006-08-16 Thread lude
Hi Johan, thanks again for the many words and explanations! You also mentioned indexing each bodypart ("attachment") separately. Why? To my mind, there is no use case where it makes sense to search a particular bodypart. I will give you the use case: 1.) User searches for "abcd" 2.) Luc

Re: Best Practice: emails and file-attachments

2006-08-16 Thread John Haxby
lude wrote: Hi John, thanks for the detailed answer. You wrote: If you're indexing a multipart/alternative bodypart then index all the MIME headers, but only index the content of the *first* bodypart. Does this mean you index just the first file-attachment? What do you advise, if you have to
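The multipart/alternative rule quoted above can be sketched in a few lines. This is only an illustration using Python's standard `email` package, not the mail server code discussed in the thread; the helper names `collect_headers` and `collect_text` and the sample message `RAW` are made up for the example.

```python
from email import message_from_string
from email.message import Message


def collect_headers(msg: Message) -> list[tuple[str, str]]:
    # Keep the MIME headers of every bodypart, including the
    # alternatives whose content is skipped below.
    return [(name, value) for part in msg.walk() for name, value in part.items()]


def collect_text(part: Message) -> list[str]:
    # multipart/alternative: descend into the *first* bodypart only.
    if part.get_content_type() == "multipart/alternative" and part.is_multipart():
        children = part.get_payload()
        return collect_text(children[0]) if children else []
    # Any other container: visit every child in order.
    if part.is_multipart():
        return [text for child in part.get_payload() for text in collect_text(child)]
    # Leaf part: only text/* is decoded in this sketch.
    if part.get_content_maintype() == "text":
        raw = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        try:
            return [raw.decode(charset, errors="replace")]
        except LookupError:  # unknown charset label
            return [raw.decode("utf-8", errors="replace")]
    return []


# Hypothetical sample message with a plain-text and an HTML alternative.
RAW = """\
MIME-Version: 1.0
Subject: demo
Content-Type: multipart/alternative; boundary="b1"

--b1
Content-Type: text/plain; charset="us-ascii"

plain body
--b1
Content-Type: text/html; charset="us-ascii"

<p>html body</p>
--b1--
"""

msg = message_from_string(RAW)
headers = collect_headers(msg)  # headers of all three parts
texts = collect_text(msg)       # content of the first alternative only
```

Non-text leaves (for example application/msword) would need a format-specific text extractor, which is left out here.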

Re: Best Practice: emails and file-attachments

2006-08-16 Thread lude
Hi Dejan, how do you query for email- and(!) attachment-documents, if you just want to present one hit per email (even if the search term matches in the email- and(!) in the corresponding attachment-document)? Thanks lude On 8/15/06, Dejan Nenov <[EMAIL PROTECTED]> wrote: The approach I fi
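One possible way to get one hit per email is to collapse the raw hits after the search, grouping on a field that every document shares with its parent email. The sketch below assumes each hit arrives as an `(email_id, doc_kind, score)` tuple; that shape, the `email_id` field, and the function name `collapse_hits` are assumptions for illustration and not taken from any reply in this thread.

```python
def collapse_hits(hits):
    """Merge email and attachment hits into one result per email.

    `hits` is an iterable of (email_id, doc_kind, score) tuples,
    a hypothetical shape chosen for this example.
    """
    merged = {}
    for email_id, doc_kind, score in hits:
        entry = merged.setdefault(
            email_id, {"email_id": email_id, "score": score, "matched": []}
        )
        # Rank the email by its best-scoring document.
        entry["score"] = max(entry["score"], score)
        entry["matched"].append(doc_kind)
    return sorted(merged.values(), key=lambda e: e["score"], reverse=True)


raw_hits = [
    ("mail-1", "email", 0.9),
    ("mail-1", "attachment", 0.7),
    ("mail-2", "attachment", 0.8),
]
results = collapse_hits(raw_hits)
```

Taking the maximum score per email is just one choice; summing the scores would instead favour emails that match in several places.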

Re: Best Practice: emails and file-attachments

2006-08-16 Thread lude
Hi John, thanks for the detailed answer. You wrote: If you're indexing a multipart/alternative bodypart then index all the MIME headers, but only index the content of the *first* bodypart. Does this mean you index just the first file-attachment? What do you advise, if you have to index mulitp

RE: Best Practice: emails and file-attachments

2006-08-15 Thread Dejan Nenov
The approach I find best is to create both Email documents, each containing a list of (and links to) all its attachments, and individual Attachment documents. It gets a little tricky when you have a forwarded email, containing an original Email that contains a tar.gz attachment, which contai
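A rough sketch of that two-kinds-of-document layout, using Python's standard `email` package and plain dicts in place of real index documents. The field names (`doc_type`, `doc_id`, `email_id`, `attachment_ids`) and the function `build_documents` are hypothetical, and the nested-archive case mentioned above is not handled.

```python
from email.message import EmailMessage


def build_documents(msg, email_id):
    """Return one Email document plus one Attachment document per file.

    A bodypart counts as an attachment here if it carries a filename,
    which is a simplifying assumption for this sketch.
    """
    file_parts = [part for part in msg.walk() if part.get_filename()]
    attachment_docs = [
        {
            "doc_type": "attachment",
            "doc_id": f"{email_id}/att-{n}",
            "email_id": email_id,  # link back to the parent email
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
        }
        for n, part in enumerate(file_parts)
    ]
    email_doc = {
        "doc_type": "email",
        "doc_id": email_id,
        "subject": str(msg.get("Subject", "")),
        # links from the email to each of its attachment documents
        "attachment_ids": [doc["doc_id"] for doc in attachment_docs],
    }
    return email_doc, attachment_docs


# Hypothetical message with a single binary attachment.
demo = EmailMessage()
demo["Subject"] = "report"
demo.set_content("see attached")
demo.add_attachment(
    b"abcd", maintype="application", subtype="octet-stream", filename="data.bin"
)
email_doc, attachment_docs = build_documents(demo, "mail-1")
```

Because `walk()` also descends into message/rfc822 parts, attachments of a forwarded email end up linked to the outer email in this sketch.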

Re: Best Practice: emails and file-attachments

2006-08-15 Thread John Haxby
lude wrote: does anybody have an idea what the best design approach is for realizing the following: The goal is to index emails and their corresponding file attachments. One email could contain for example: I put a fair amount of thought into this when I was doing the design for our mail server -