Re: [Dovecot] FTS Plugin design

Rui Carneiro Fri, 15 May 2009 09:48:36 -0700

Quoting Timo Sirainen <t...@iki.fi>:
> 1. You notice a non-text/* content-type and initialize text extraction
> for the MIME part. Like:
> 
> struct attachment_extract_context *
> attachment_extract_init(const char *content_type);
> 
> 2. After this you feed all the input belonging to that MIME part to:
> 
> int attachment_extract_add(struct attachment_extract_context *ctx,
> const struct message_block *input);
> 
> Don't output anything to FTS backend at this point. The
> attachment_extract_add() would probably just basically write to a
> temporary file.
> 
> 3. Finally you'll notice that the MIME part ends (either you get headers
> for the next MIME part or the entire message ends). Then finish the
> extraction, which actually executes whatever conversion binaries:
> 
> int attachment_extract_finish(struct attachment_extract_context *ctx);
> 
> 4. Get the resulting text to fts_backend_build_more() somehow. Either
> some attachment_extract_add_to_fts() which internally adds it or some
> kind of an iterator that returns the text in smaller blocks. Either
> would work..
> 
> That kind of an API would also make it possible to pretty easily modify
> in future to not write temporary files for specific content types if
> it's not required.
>
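
For reference, the four steps above could fit together roughly as in the sketch below. It is not Dovecot code and makes several assumptions: struct message_block is reduced to a simplified stand-in with only a data pointer and a size, the temporary file comes from tmpfile(), and attachment_extract_finish() only reads the spooled bytes back where a real version would run the conversion binary. attachment_extract_next() and attachment_extract_deinit() are hypothetical names for the iterator and the cleanup.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for Dovecot's struct message_block (assumption:
   only a data pointer and its size matter for this sketch). */
struct message_block {
	const unsigned char *data;
	size_t size;
};

struct attachment_extract_context {
	char *content_type; /* would select the converter in a real version */
	FILE *tmp;          /* spooled MIME part body */
	size_t spooled;     /* bytes written to tmp so far */
	char *text;         /* extracted text, set by finish() */
	size_t text_len, pos;
};

/* Hypothetical cleanup helper: releases everything the context owns. */
void attachment_extract_deinit(struct attachment_extract_context *ctx)
{
	if (ctx == NULL)
		return;
	if (ctx->tmp != NULL)
		fclose(ctx->tmp);
	free(ctx->content_type);
	free(ctx->text);
	free(ctx);
}

/* Step 1: start extraction for a MIME part that is not plain text. */
struct attachment_extract_context *
attachment_extract_init(const char *content_type)
{
	struct attachment_extract_context *ctx;

	assert(content_type != NULL);
	ctx = calloc(1, sizeof(*ctx));
	if (ctx == NULL)
		return NULL;
	ctx->content_type = malloc(strlen(content_type) + 1);
	ctx->tmp = tmpfile();
	if (ctx->content_type == NULL || ctx->tmp == NULL) {
		attachment_extract_deinit(ctx);
		return NULL;
	}
	strcpy(ctx->content_type, content_type);
	return ctx;
}

/* Step 2: spool one input block to the temporary file. */
int attachment_extract_add(struct attachment_extract_context *ctx,
			   const struct message_block *input)
{
	assert(ctx != NULL && input != NULL);
	if (input->size == 0)
		return 0;
	if (fwrite(input->data, 1, input->size, ctx->tmp) != input->size)
		return -1;
	ctx->spooled += input->size;
	return 0;
}

/* Step 3: the MIME part has ended. Placeholder conversion: the spooled
   bytes are read back unchanged instead of running an external binary. */
int attachment_extract_finish(struct attachment_extract_context *ctx)
{
	assert(ctx != NULL && ctx->text == NULL);
	rewind(ctx->tmp);
	ctx->text = malloc(ctx->spooled + 1);
	if (ctx->text == NULL)
		return -1;
	if (fread(ctx->text, 1, ctx->spooled, ctx->tmp) != ctx->spooled)
		return -1;
	ctx->text[ctx->spooled] = '\0';
	ctx->text_len = ctx->spooled;
	return 0;
}

/* Step 4 (hypothetical iterator): return the next block of at most
   max bytes of extracted text, or 0 once everything was handed out. */
size_t attachment_extract_next(struct attachment_extract_context *ctx,
			       const char **block_r, size_t max)
{
	size_t n;

	assert(ctx != NULL && ctx->text != NULL && block_r != NULL);
	n = ctx->text_len - ctx->pos;
	if (n > max)
		n = max;
	*block_r = ctx->text + ctx->pos;
	ctx->pos += n;
	return n;
}
```

The caller would loop over attachment_extract_next() and pass each block on to the FTS backend, then call attachment_extract_deinit().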


I tried your approach and I think it is working pretty well. Now I only need to
look carefully at the output of the external programs and build the XML
correctly to send to Solr.
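
The XML part might look something like the sketch below, which wraps the extracted text in a Solr <add><doc> update message and escapes the characters that would break the XML. The field names "id" and "body" are assumptions that depend on the Solr schema, solr_build_add_xml() is a hypothetical helper name, and the input is assumed to be valid UTF-8 already (converter output may need checking for that separately).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Minimal growable string buffer, always NUL-terminated. */
struct strbuf {
	char *data;
	size_t len, cap;
};

static int strbuf_append(struct strbuf *sb, const char *s, size_t n)
{
	if (sb->len + n + 1 > sb->cap) {
		size_t cap = sb->cap != 0 ? sb->cap : 64;
		char *p;

		while (cap < sb->len + n + 1)
			cap *= 2;
		p = realloc(sb->data, cap);
		if (p == NULL)
			return -1;
		sb->data = p;
		sb->cap = cap;
	}
	memcpy(sb->data + sb->len, s, n);
	sb->len += n;
	sb->data[sb->len] = '\0';
	return 0;
}

/* Append text with &, <, > and " escaped. Control characters that
   XML 1.0 does not allow (below 0x20, except tab, LF and CR) are
   replaced with a space. */
static int xml_escape_append(struct strbuf *sb, const char *text)
{
	for (; *text != '\0'; text++) {
		unsigned char c = (unsigned char)*text;
		const char *rep = NULL;
		char one;

		switch (c) {
		case '&': rep = "&amp;"; break;
		case '<': rep = "&lt;"; break;
		case '>': rep = "&gt;"; break;
		case '"': rep = "&quot;"; break;
		default:
			one = (c < 0x20 && c != '\t' && c != '\n' && c != '\r') ?
				' ' : (char)c;
			if (strbuf_append(sb, &one, 1) < 0)
				return -1;
			continue;
		}
		if (strbuf_append(sb, rep, strlen(rep)) < 0)
			return -1;
	}
	return 0;
}

/* Hypothetical helper: build a Solr update message for one document.
   Returns a malloc()ed string the caller must free, or NULL on error. */
char *solr_build_add_xml(const char *id, const char *body)
{
	static const char head[] = "<add><doc><field name=\"id\">";
	static const char mid[] = "</field><field name=\"body\">";
	static const char tail[] = "</field></doc></add>";
	struct strbuf sb = { NULL, 0, 0 };

	assert(id != NULL && body != NULL);
	if (strbuf_append(&sb, head, sizeof(head) - 1) < 0 ||
	    xml_escape_append(&sb, id) < 0 ||
	    strbuf_append(&sb, mid, sizeof(mid) - 1) < 0 ||
	    xml_escape_append(&sb, body) < 0 ||
	    strbuf_append(&sb, tail, sizeof(tail) - 1) < 0) {
		free(sb.data);
		return NULL;
	}
	return sb.data;
}
```

Replacing the disallowed control characters with a space rather than dropping them keeps words on either side from being joined together in the index.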

Thanks Timo

Regards,
Rui Carneiro

-- 
Portugalmail, Comunicações S.A.
www.portugalmail.net
