asking about incremental update

2010-08-18 Thread Yakob
Hello all, you may remember me as the one who asked how to understand Lucene in a previous email, but I have now been able to create a sample Lucene application. I read the book and was able to test it, which to me is great, as I am a new learner. Here is my proof: http://jacobian.web.id

Re: cluster documents based on fields' values

2010-08-18 Thread Stanislaw Osinski
> A colleague of mine also discovered solr's clustering component - http://wiki.apache.org/solr/ClusteringComponent. It's still labeled as experimental - does anybody have experience with it?
The clustering component is based on the Carrot2 project (project.carrot2.org). Carrot2 has been

RE: Sorting a Lucene index

2010-08-18 Thread Shelly_Singh
Hi Anshum, I require sorted results for all my queries, and the field on which I need sorting is fixed; this led me to the idea of storing documents in sorted order to avoid the sorting cost on every query. Thanks and Regards, Shelly Singh Center For Knowledge Driven Information Systems, Infosys Email:

Re: Solr SynonymFilter in Lucene analyzer

2010-08-18 Thread Lance Norskog
Yes, you need an analyzer that leaves successive words together as one long term. This might be easier to do with the new CharFilter tool, which processes text before it goes to the tokenizer. What you are doing here is similar to Parts-Of-Speech analysis, where text analysis software parses a sen

Re: Solr SynonymFilter in Lucene analyzer

2010-08-18 Thread Arun Rangarajan
I think the lucene WhitespaceAnalyzer I am using inside Solr's SynonymFilter is the one that prevents multi-word synonyms like "New York" from getting mapped to the generic synonym name like CONCEPTYcity. It appears to me that an analyzer which recognizes that a white-space is inside a synonym like

Problems with Lucene 3.0.2 and Java 1.6.0_12

2010-08-18 Thread Nader, John P
This is a follow up related to my original post Term browsing performance problems with our upgrade to Lucene 3.0.2. The suggestions were helpful and did give us a performance increase. However, in a full scale environment under load our performance issue remained a problem. Our investig

Re: "Natural sorting" of documents in a Lucene index - possible?

2010-08-18 Thread Ian Lea
> Can you guys tell me more about "warm up queries" strategies?
> I know that once you made one query, the second time is super quick because it's in cache - but how can you do warm up queries when you don't know what users are going to search?
It's not so much that the hits or queries are

Re: "Natural sorting" of documents in a Lucene index - possible?

2010-08-18 Thread Michel Nadeau
Can you guys tell me more about "warm up queries" strategies? I know that once you made one query, the second time is super quick because it's in cache - but how can you do warm up queries when you don't know what users are going to search? - Mike aka...@gmail.com On Wed, Aug 18, 2010 at 11:2

Re: "Natural sorting" of documents in a Lucene index - possible?

2010-08-18 Thread Michel Nadeau
Thanks!
- Mike aka...@gmail.com
On Wed, Aug 18, 2010 at 10:37 AM, Ian Lea wrote:
> > But - to come back to my original question... is there any way to have a "natural order" of documents other than the DocId in Lucene?
> No.
> --
> Ian.
> On Wed, Aug 18, 2010 at 3:21 PM, Michel

Re: "Natural sorting" of documents in a Lucene index - possible?

2010-08-18 Thread Ian Lea
> But - to come back to my original question... is there any way to have a "natural order" of documents other than the DocId in Lucene?
No.
--
Ian.
On Wed, Aug 18, 2010 at 3:21 PM, Michel Nadeau wrote:
> Cool, so I'll try these things -
> * Replace timestamps with MMDD - will minimize

Re: "Natural sorting" of documents in a Lucene index - possible?

2010-08-18 Thread Michel Nadeau
Cool, so I'll try these things -
* Replace timestamps with MMDD - will minimize unique terms count;
* Use NumericFields for dates and numbers - will remove all string sorting.
Thanks guys!
--
But - to come back to my original question... is there any way to have a "natural order" of documen

Re: TermQuery and ConstantScoreQuery on TermsFilter

2010-08-18 Thread Ian Lea
Hard to say - there are many factors involved in searching. I'd just use the easiest queries that were fast enough. If you want a better answer, more info would be useful. For starters: What version of Lucene? How big is the index? How many hits? Exactly what do the queries look like (q.toString

Re: Sorting a Lucene index

2010-08-18 Thread Anshum
Hi Shelly, The search results so returned are sorted either by relevance, index order, stored field, or custom order. As you are saying that you would not be able to maintain the index order, you would have to do the sort at run time. Sorting on a stored field is not costly and you may use it comf

TermQuery and ConstantScoreQuery on TermsFilter

2010-08-18 Thread Shelly_Singh
Hi, In my Lucene index, I want to search on a field, but the score or order of returned documents is not important. What is important is which documents are returned. As I do not need scores or even default sorting (order by docid), what is the best way to write a query? I compared perf

Sorting a Lucene index

2010-08-18 Thread Shelly_Singh
Hi, I have a Lucene index that contains a numeric field along with certain other fields. The order of incoming documents is random and unpredictable. As a result, while creating an index, I end up adding docs in random order with respect to the numeric field value. For example, documents may