Re: [Dovecot] New full text search indexer

2007-11-14 Thread Timo Sirainen
On 14.11.2007, at 17.20, Adam McDougall wrote:
> Thanks for this list of steps, I've been intending to test it and was just about getting ready to ask the same question. Your email would be nice content to throw on the dovecot wiki under fts (currently empty)
I wrote something there now. Al

Re: [Dovecot] New full text search indexer

2007-11-14 Thread Asheesh Laroia
On Wed, 14 Nov 2007, Adam McDougall wrote:
> Thanks for this list of steps, I've been intending to test it and was just about getting ready to ask the same question. Your email would be nice content to throw on the dovecot wiki under fts (currently empty)
Good point, but you do it, it's bedtim

Re: [Dovecot] New full text search indexer

2007-11-14 Thread Adam McDougall
On Thu, Nov 15, 2007 at 12:07:15AM +0900, Asheesh Laroia wrote: On Wed, 14 Nov 2007, Daniel Watts wrote: >> Timo - we were just having a conversation about how we might be able to >> provide full body indexed search for our clients and I realised it might >> be worth checking the Dovecot li

Re: [Dovecot] New full text search indexer

2007-11-14 Thread Asheesh Laroia
On Wed, 14 Nov 2007, Daniel Watts wrote: Timo - we were just having a conversation about how we might be able to provide full body indexed search for our clients and I realised it might be worth checking the Dovecot list to see if this has been done already... And then I find this thread! W

Re: [Dovecot] New full text search indexer

2007-11-14 Thread Daniel Watts
Timo Sirainen wrote:
> On Fri, 2007-04-06 at 00:34 +0300, Timo Sirainen wrote:
>> Squat-like 4 byte substrings (but can answer 1-3 char queries also):
> Indexing a 1,4GB Linux kernel mailing list mbox with 367919 messages:
> UID count: 367919
> Index time: 129.86 CPU seconds (10.43MB/CPUs), 132.47 secon

Re: [Dovecot] New full text search indexer

2007-04-13 Thread Ben Winslow
On Fri, 13 Apr 2007 14:07:58 +0300 Timo Sirainen <[EMAIL PROTECTED]> wrote: > gzip compression makes the uidlist still 25% smaller (total space > 19,50%). It'd have to be used to compress the file in smaller blocks > because zlib doesn't support quickly seeking inside the file. That would > probab
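
The block-wise idea in the quoted text (compress in smaller blocks so a reader does not have to decompress from the start of the file) can be sketched roughly as below. This is an illustration only, not Dovecot's code: `BLOCK_SIZE`, `compress_blocks`, and `read_range` are hypothetical names, and the block size is an arbitrary assumption.

```python
import zlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def compress_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Compress data as independent zlib streams, one per fixed-size block."""
    return [
        zlib.compress(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def read_range(blocks: list[bytes], offset: int, length: int,
               block_size: int = BLOCK_SIZE) -> bytes:
    """Return data[offset:offset+length], decompressing only the blocks needed."""
    if length <= 0:
        return b""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    # Only the blocks that overlap the requested range are decompressed.
    chunk = b"".join(zlib.decompress(b) for b in blocks[first:last + 1])
    start = offset - first * block_size
    return chunk[start:start + length]
```

Smaller blocks make seeking cheaper but usually compress less well, since each block starts with an empty dictionary.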

Re: [Dovecot] New full text search indexer

2007-04-13 Thread Timo Sirainen
On Fri, 2007-04-06 at 00:34 +0300, Timo Sirainen wrote:
> Squat-like 4 byte substrings (but can answer 1-3 char queries also):
Indexing a 1,4GB Linux kernel mailing list mbox with 367919 messages:
UID count: 367919
Index time: 129.86 CPU seconds (10.43MB/CPUs), 132.47 seconds (10.23MB/s)
Memory:
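
For readers unfamiliar with the "4 byte substrings" approach, here is a toy in-memory sketch of the general technique. It is not Dovecot's Squat implementation or its on-disk format; `build_index` and `search` are hypothetical names, and unlike the indexer described above this sketch does not handle 1-3 character queries.

```python
from collections import defaultdict

GRAM = 4  # substring length, matching the "4 byte substrings" mentioned above


def build_index(messages: dict[int, bytes]) -> dict[bytes, set[int]]:
    """Map every 4-byte substring to the set of UIDs whose text contains it."""
    index = defaultdict(set)
    for uid, text in messages.items():
        for i in range(len(text) - GRAM + 1):
            index[text[i:i + GRAM]].add(uid)
    return index


def search(index, messages, query: bytes) -> set[int]:
    """Find UIDs containing query (only queries of at least GRAM bytes here)."""
    if len(query) < GRAM:
        raise ValueError("this sketch only handles queries of 4+ bytes")
    grams = [query[i:i + GRAM] for i in range(len(query) - GRAM + 1)]
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # All grams being present does not prove the whole query is contiguous,
    # so confirm each candidate against the message text.
    return {uid for uid in candidates if query in messages[uid]}
```

The index narrows the search to a small candidate set, and only those candidates are scanned in full.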

Re: [Dovecot] New full text search indexer

2007-04-06 Thread Charles Marcus
Timo Sirainen wrote: As described earlier (http://dovecot.org/list/dovecot/2006-December/018055.html), Dovecot nowadays has full text search indexing support in CVS HEAD. So it takes somewhat more space, but definitely less than having both Squat + Lucene. No substring indexing, words up

Re: [Dovecot] New full text search indexer

2007-04-06 Thread Timo Sirainen
On Fri, 2007-04-06 at 10:42 +0200, DINH Viêt Hoà wrote:
> Another problem with Squat is that we can't remove items from the index (in the Cyrus version). Is that still the case?
No. Dovecot's Squat is almost completely different from Cyrus. I just kept the name because the basic ideas are the

Re: [Dovecot] New full text search indexer

2007-04-06 Thread DINH Viêt Hoà
On 4/5/07, Timo Sirainen <[EMAIL PROTECTED]> wrote: As described earlier (http://dovecot.org/list/dovecot/2006-December/018055.html), Dovecot nowadays has full text search indexing support in CVS HEAD. Currently there are two backends: Lucene and Squat. Lucene's problem is that standard IMAP SEA

[Dovecot] New full text search indexer

2007-04-05 Thread Timo Sirainen
As described earlier (http://dovecot.org/list/dovecot/2006-December/018055.html), Dovecot nowadays has full text search indexing support in CVS HEAD. Currently there are two backends: Lucene and Squat. Lucene's problem is that the standard IMAP SEARCH command can't be used with it without breaking IMA