Re: Re[4]: Frequently updated fields

2008-09-17 Thread Jason Rutherglen
Hi Wojciech, Integration with SOLR would be ideal. However, that would take more time. It depends on the exact features. There is at least one patch to IndexWriter. The merging is the part that needs to be synchronized, and this is where I am hesitant because Ocean/realtime search performs merge

Re[4]: Frequently updated fields

2008-09-17 Thread Wojciech Strzałka
I'll ask my boss. I don't expect it to be soon, but once SOLR is implemented, we will see how it works for us and maybe we will want the feature, as we already have very good experience with sponsoring Open Source features. Can you tell me how big the project is? Are we talking about

Re: Re[2]: Frequently updated fields

2008-09-16 Thread Jason Rutherglen
Hi Wojciech, The code isn't ready; it is a major project, and I am also trying to complete the realtime indexing patches and look for a job. I believe that the tag indexing stuff is of interest to many people, so if there is someone who can pay to get it completed, feel free to contact me as I am av

Re[2]: Frequently updated fields

2008-09-16 Thread Wojciech Strzałka
I saw your comments on JIRA. You mentioned a rework, and I'm wondering if the currently available patch is production ready (functionally complete)? Will the code after the rework work with an index built with the current version? I'm quite new to SOLR/Lucene but I hope I could write custom

Re: Frequently updated fields

2008-09-14 Thread Jason Rutherglen
It would be good to allow users to use their own Filter subclasses in SOLR. This will help with RMI-based implementations that use SOLR, and will allow all of the open source Filter work to be used in SOLR without needing to recreate it with DocSets. 2008/9/14 Gerardo Segura <[EMAIL PROTECTED]>:

Re: Frequently updated fields

2008-09-14 Thread Gerardo Segura
I had similar requirements: some fields didn't require text processing; they were just used as filters to focus the search on a subset of documents in Solr. As Karl suggested, implementing a filter was the most direct approach for me. The issue was that, not being familiar with Solr myself, I c

Re[2]: Frequently updated fields

2008-09-13 Thread Wojciech Strzałka
My strong requirement is that the search server runs on a different machine than the client, so I think I have two options: SOLR or Lucene via RMI (RemoteSearchable). So by now it looks like I have several options: 1. TagIndex - as described here http://issues.apache.org/jira/browse/

Re: Frequently updated fields

2008-09-12 Thread Jason Rutherglen
Yes, Tag Index will work. I have not had time to complete it; however, if you are interested in working on it, please feel free to contact me. On Fri, Sep 12, 2008 at 3:48 PM, Mark Miller <[EMAIL PROTECTED]> wrote: > You might check out the tagindex issue in jira as well. Haven't looked at it > myself

Re: Frequently updated fields

2008-09-12 Thread Mark Miller
You might check out the tagindex issue in jira as well. Haven't looked at it myself, but I believe it's supposed to be an option for this. Gerardo Segura wrote: I think the important question is: in general how to cope with frequently changing fields. Karl Wettin wrote: Hi Wojciech, can you

Re: Frequently updated fields

2008-09-12 Thread Karl Wettin
There is no single easy answer to the question. There are a number of solutions to the problem; in this thread we've so far listed the following: reindexing the document in a single index, using parallel indices, and using filters created from the source data. There are other things one can do too, but wh

Re: Frequently updated fields

2008-09-12 Thread Gerardo Segura
I think the important question is: in general, how to cope with frequently changing fields. Karl Wettin wrote: Hi Wojciech, can you please give us a bit more specific information about the meta data fields that will change? I would recommend looking at creating filters from your primary

Re: Re[2]: Frequently updated fields

2008-09-12 Thread Erick Erickson
If you search the archive, this very topic has been discussed many times. You'll find a wealth of discussion and more than a few options outlined there. Best, Erick 2008/9/12 Wojciech Strzałka <[EMAIL PROTECTED]> > > The most frequently changing fields will be, I think: > Status (read/unread): in fact I'm af

Re: Frequently updated fields

2008-09-12 Thread Karl Wettin
On 12 Sep 2008 at 14:51, Wojciech Strzałka wrote: The most frequently changing fields will be, I think: Status (read/unread): in fact this is what I'm most afraid of - any mail coming into the system will need to be indexed at least twice This is why I recommended that you use a filte

Re[2]: Frequently updated fields

2008-09-12 Thread Wojciech Strzałka
The most frequently changing fields will be, I think: Status (read/unread): in fact this is what I'm most afraid of - any mail coming into the system will need to be indexed at least twice Flags: 0..n values from enum Tags: 0..n values from enum Of course all the other field

Re[2]: Frequently updated fields

2008-09-12 Thread Wojciech Strzałka
@lucene.apache.org >> Subject: Frequently updated fields >> >> Hi. >> >>I'm new to Lucene and I would like to get a few answers (they can >>be lame) >> >>I want to index large amount of emails using Lucene (maybe >> SOLR), not only

Re: Frequently updated fields

2008-09-12 Thread Karl Wettin
Hi Wojciech, can you please give us a bit more specific information about the meta data fields that will change? I would recommend looking at creating filters from your primary persistence store for query clauses such as unread/read, mailbox folders, etc. karl On 12 Sep 2008 at 13:57

RE: Frequently updated fields

2008-09-12 Thread Jimi Hullegård
> -Original Message- > From: Wojciech Strzałka [mailto:[EMAIL PROTECTED] > Sent: 12 September 2008 13:58 > To: java-user@lucene.apache.org > Subject: Frequently updated fields > > Hi. > >I'm new to Lucene and I would like to get a few answers (they

Frequently updated fields

2008-09-12 Thread Wojciech Strzałka
Hi. I'm new to Lucene and I would like to get a few answers (they can be lame) I want to index a large number of emails using Lucene (maybe SOLR), not only the contents but also some metadata like state or flags. The problem is that the metadata will change during the mail lifecycle,