Re: Grouping on multiple shards possible in lucene?

2012-11-21 Thread Shai Erera
If you are only interested in doc addition sorting, then it should be easy to reverse the doc orders in each segment, using something like IndexSorter. Shai On Wed, Nov 21, 2012 at 8:03 AM, Ravikumar Govindarajan < ravikumar.govindara...@gmail.com> wrote: > Hi Shai, > > I would only want to sort

Re: Grouping on multiple shards possible in lucene?

2012-11-21 Thread Ravikumar Govindarajan
Yeah, but IndexSorter is offline. I need an online sorter. The trouble is, as Mike pointed out, that the delta encodings are forward-only. I do not know of an available encoding to do this. -- Ravi On Wed, Nov 21, 2012 at 3:26 PM, Shai Erera wrote: > If you are only interested in doc addition sorting

Re: Which stemmer?

2012-11-21 Thread Elmer van Chastelet
I've just created a small web application which you might find useful. You can see which words are matched by a query word when using different analyzers (phonetic and stemming analyzers). These include snowball, kstem and minimal stem (the ones on the right). http://dutieq.st.ewi.tudelft.nl/w

Potential Resource Leak warning in Analyzer.createComponents()

2012-11-21 Thread Carsten Schnober
Hi, I use a custom analyzer and tokenizer. The analyzer is very basic and it merely comprises the method createComponents(): - @Override protected TokenStreamComponents createComponents(String fieldName, Reader reader) { return new Toke

RE: Potential Resource Leak warning in Analyzer.createComponents()

2012-11-21 Thread Uwe Schindler
Disable this warning, your workaround is worse than the warning. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Carsten Schnober [mailto:schno...@ids-mannheim.de] > Sent: Wednesday, November 21, 2012 2:

Question about ordering rule of SpanNearQuery

2012-11-21 Thread 杨光
Hi all, We have recently been developing a platform with Lucene. The ordering rule we specified is that the document with the shortest distance between query terms ranks first. But SpanNearQuery may behave a little differently: it returns all the documents within the qualifying distance. So I am con
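For illustration only, the ordering rule described in this message (the document with the shortest distance between query terms ranks first) can be sketched in plain Java without Lucene. This is a minimal sketch, not Lucene's scoring and not the poster's code: the class name MinDistanceRanking, the whitespace tokenizer, and the use of the raw position difference as "distance" are all assumptions made for the example. Lucene's own slop calculation differs from a raw position difference.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Lucene.
public class MinDistanceRanking {

    // Assumption: lowercase + whitespace split stands in for a real analyzer.
    static String[] tokenize(String doc) {
        return doc.toLowerCase(Locale.ROOT).trim().split("\\s+");
    }

    // Smallest difference in token positions between any occurrence of a
    // and any occurrence of b; returns -1 if either term is missing.
    static int minDistance(String[] tokens, String a, String b) {
        int lastA = -1, lastB = -1, best = -1;
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(a)) lastA = i;
            if (tokens[i].equals(b)) lastB = i;
            if (lastA >= 0 && lastB >= 0) {
                int d = Math.abs(lastA - lastB);
                if (best < 0 || d < best) best = d;
            }
        }
        return best;
    }

    // Keeps documents whose minimum distance is at most maxDistance and
    // orders them so that the shortest distance comes first.
    static List<String> rank(List<String> docs, String a, String b, int maxDistance) {
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            int d = minDistance(tokenize(doc), a, b);
            if (d >= 0 && d <= maxDistance) hits.add(doc);
        }
        hits.sort(Comparator.comparingInt(doc -> minDistance(tokenize(doc), a, b)));
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "lucene is a search library",
                "search with lucene",
                "lucene search",
                "lucene only");
        for (String doc : rank(docs, "lucene", "search", 10)) {
            System.out.println(doc);
        }
    }
}
```

The filter step corresponds to "all the documents within the qualifying distance", and the sort step is the extra ordering the message asks for on top of that.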

Re: Question about ordering rule of SpanNearQuery

2012-11-21 Thread Jack Krupansky
Add &debugQuery=true to your query and look at the "explain" section to see how the scoring is calculated for each document. Sometimes it is counter-intuitive and some factors may differ but those differences can be overwhelmed by other, unrelated factors. -- Jack Krupansky -Original Mess

Re: Question about ordering rule of SpanNearQuery

2012-11-21 Thread Jack Krupansky
Oops... sorry, I just noticed that you are a Lucene, not Solr, user. Call the IndexSearcher#explain method to get the explanation and call the toString method on the explanation to see the readable text. http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/IndexSearcher.html#explai

Re: Which stemmer?

2012-11-21 Thread Jack Krupansky
Great! For my favorite example of "invest", "invests", etc. it shows: SnowballEnglish: •investment •invest •invests •investing •invested kStem: •investors •invest •investor •invests •investing •invested minimalStem:invest •invest •invests That highlights the distinctions between these stemmers

Re: Question about ordering rule of SpanNearQuery

2012-11-21 Thread Chris Hostetter
: I am confused about the ordering rule of SpanNearQuery. For example, I : indicate the slop in SpanNearQuery is 10. And the results are all the : qualified documents. Is it true that any document with shorter distance ... : it still uses tf-idf algorithm to rank the docs. Or there is

Re: Multiple facets in Lucene searches

2012-11-21 Thread Shai Erera
Hi Jan, Basically, DrillDown is a helper class for creating such queries. You're right that its query() methods create AND, because that's normally the case, but if you require OR, you could do this: BooleanQuery res = new BooleanQuery(); for (CategoryPath cp : paths) { res.add(new