Re: FastSSFuzzy for faster fuzzy queries in Lucene

2009-01-05 Thread Robert Muir
Hi, although I've been trying to get my code into shape to upload to JIRA (holidays got in the way a bit), I think there might be some issues making my implementation work for general use. I based my design on certain assumptions, such as the fact that I don't update indexes. Once my index is

Re: FastSSFuzzy for faster fuzzy queries in Lucene

2009-01-05 Thread Grant Ingersoll
Do you have a reference paper/link on it? Sounds interesting. On Jan 5, 2009, at 8:17 PM, Jason Rutherglen wrote: Hello, I'm interested in getting FastSSFuzzy into Lucene, perhaps as a contrib module. One question is how much would the index grow? We've got a list of people's names we

Re: Exception when field sort.

2009-01-05 Thread 장용석
Thanks, that works for me. Thank you. :-) - Jang On 09. 1. 6, Koji Sekiguchi wrote: > > That's correct! > > Koji > > > 장용석 wrote: > > Thanks for your advice. > > > > If I want to sort some field (for example name is "TITLE") and It must be > > Analyzed. > > > > Then Do I have to make two field that one i

Re: Exception when field sort.

2009-01-05 Thread Koji Sekiguchi
That's correct! Koji 장용석 wrote: > Thanks for your advice. > > If I want to sort some field (for example name is "TITLE") and It must be > Analyzed. > > Then Do I have to make two field that one is ANALYZED and the other is > NOT_ANALYZED like this? > > document.add(new Field("TITLE", value, Fiel

Re: Exception when field sort.

2009-01-05 Thread 장용석
Thanks for your advice. If I want to sort on a field (for example, one named "TITLE") and it must be analyzed, do I have to make two fields, one ANALYZED and the other NOT_ANALYZED, like this? document.add(new Field("TITLE", value, Field.Store.NO, Field.Index.ANALYZED)); document.add(new

Re: Exception when field sort.

2009-01-05 Thread Koji Sekiguchi
See Sort class javadoc: http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc/org/apache/lucene/search/Sort.html It says: The fields used to determine sort order must be carefully chosen. Documents must contain a single term in such a field, and the value of the term should indicate the

FastSSFuzzy for faster fuzzy queries in Lucene

2009-01-05 Thread Jason Rutherglen
Hello, I'm interested in getting FastSSFuzzy into Lucene, perhaps as a contrib module. One question is how much would the index grow? We've got a list of people's names we want to do spellchecking on for example. -J
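
To get a rough feel for the index-growth question, here is a minimal sketch that assumes FastSS indexes every variant of a word obtained by deleting up to k characters (its "deletion neighborhood"). The class and method names are hypothetical and this is not the code discussed in the thread; it only counts how many extra entries one name would produce.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene or of any FastSS implementation.
final class DeletionNeighborhood {

    /** Returns the word itself plus every string reachable by deleting up to k characters. */
    static Set<String> variants(String word, int k) {
        Set<String> out = new HashSet<>();
        collect(word, k, out);
        return out;
    }

    // Recursively removes one character at each position until the deletion budget is spent.
    private static void collect(String s, int remaining, Set<String> out) {
        out.add(s);
        if (remaining == 0) {
            return;
        }
        for (int i = 0; i < s.length(); i++) {
            collect(s.substring(0, i) + s.substring(i + 1), remaining - 1, out);
        }
    }

    public static void main(String[] args) {
        // Prints the number of entries each name would contribute for k = 1 and k = 2.
        for (String name : new String[] {"jason", "rutherglen"}) {
            System.out.println(name + ": k=1 -> " + variants(name, 1).size()
                    + ", k=2 -> " + variants(name, 2).size());
        }
    }
}
```

Under this assumption the number of entries per word grows quickly with both word length and k, so running the counter over a sample of the actual name list would give a first estimate of the growth.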

Exception when field sort.

2009-01-05 Thread 장용석
Hi. I want to test sorting search results, so I created a simple index like this: String[] samples = {"duck dog","first dog","grammar dog","come dog","basic dog","intro dog","lipton dog","search dog","servlet dog","jan dog"}; Directory dir = FSDirectory.getDirectory(path); IndexWriter writer = new

Re: about TopFieldDocs

2009-01-05 Thread 장용석
Thanks for your help. It's really helpful for me. Thanks very much. :-) -Jang. -- DEV용식 http://devyongsik.tistory.com

Re: IndexCommit#getFileNames() returning duplicates?

2009-01-05 Thread Michael McCandless
OK I opened & resolved LUCENE-1509 to fix this, for 2.9: https://issues.apache.org/jira/browse/LUCENE-1509 Mike Shalin Shekhar Mangar wrote: Hello, Solr uses IndexCommit#getFileNames() to get a list of files for replication. One Windows user reported an exception which looks like it may

Re: ORs and Ranks

2009-01-05 Thread Erick Erickson
As you say, your "real" queries are more complex, but your example seems like a simple boost to me joined by an OR clause. MEDICAL:CAT^10 OR ANIMAL:CAT which you can construct in a BooleanQuery as two clauses and "SHOULD". The sense of this is that a hit must contain "CAT" in either the MEDICAL
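
The effect of two optional ("SHOULD") clauses with different boosts can be pictured with a toy model. This is not Lucene's scoring, which also weighs term frequency and normalization; it is only a sketch in which a document's score is the sum of the boosts of the clauses it matches. The class name and the map-based document representation are assumptions made for illustration.

```java
import java.util.Map;

// Toy model of optional clauses with boosts; hypothetical, not Lucene code.
final class ShouldClauseSketch {

    /**
     * Sums the boost of every clause whose field contains the term.
     * A result of 0.0 means no clause matched, so the document is not a hit.
     */
    static double score(Map<String, String> doc, Map<String, Double> boostByField, String term) {
        double total = 0.0;
        for (Map.Entry<String, Double> clause : boostByField.entrySet()) {
            String text = doc.get(clause.getKey());
            if (text != null && text.contains(term)) {
                total += clause.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Mirrors MEDICAL:CAT^10 OR ANIMAL:CAT from the message above.
        Map<String, Double> boosts = Map.of("MEDICAL", 10.0, "ANIMAL", 1.0);
        System.out.println(score(Map.of("MEDICAL", "CAT scan"), boosts, "CAT"));
        System.out.println(score(Map.of("ANIMAL", "CAT"), boosts, "CAT"));
    }
}
```

In this model a document matching only the MEDICAL clause outranks one matching only the ANIMAL clause, and a document matching neither is excluded, which is the behavior the OR of two boosted clauses is meant to express.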

ORs and Ranks

2009-01-05 Thread Walt Stoneburner
Got an interesting question about Lucene's behavior, as recently I was handed something that looked like this: ( +MEDICAL CAT^2 ) OR ( +ANIMAL CAT^-2 ) The intention of the query is to say "if medical is found, then rank cat [scans] high, but if animal is found then rank cat [a feline] low." Pr

Re: Default and optimal use of RAMDirectory

2009-01-05 Thread Erick Erickson
In general from what I've seen on this list for the last couple of years, you're right. You're better off tweaking the various parameters of your IndexWriter (e.g. MaxBufferedDocs, MergeFactor, MergeDocs, etc.) than trying to use the blunt tool of RAMDirectory. Best Erick On Mon, Jan 5, 2009 at 1

Re: Default and optimal use of RAMDirectory

2009-01-05 Thread Ariel
Do you mean that people who think using RAMDirectory will speed up the indexing process are wrong? On Sun, Dec 21, 2008 at 10:22 PM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote: > Let me add to that that I clearly recall having a hard time getting the > tests for th

Re: about TopFieldDocs

2009-01-05 Thread Mark Miller
Erick Erickson wrote: > The number of documents > is irrelevant here, what is relevant is the number of > distinct terms in your "fieldName" field. > Depending on the size of your index, the number of docs will matter though. You have to store the unique terms in a String[] array, but you also s

Re: about TopFieldDocs

2009-01-05 Thread Erick Erickson
Mostly, the difference is in the sorting. Your example (1) scores by document relevance whereas your example (2) sorts by whatever is in fieldName. example (2), because it is sorting, will try to cache all the distinct *terms* in your index for that field, which is probably where your out of memo
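
Why sorting by a field costs memory can be sketched with a simplified cache. The layout below is an assumption made for illustration (one sorted array of distinct terms plus one integer per document), not Lucene's actual implementation, and the class name is hypothetical. It also assumes every document holds exactly one non-null term in the field.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Simplified, hypothetical model of a per-field sort cache.
final class SortCacheSketch {
    final String[] terms;  // one entry per distinct term, in sorted order
    final int[] ordByDoc;  // one int per document: position of its term in terms

    SortCacheSketch(String[] fieldValueByDoc) {
        // Deduplicate and sort the field values once.
        terms = new TreeSet<>(Arrays.asList(fieldValueByDoc)).toArray(new String[0]);
        ordByDoc = new int[fieldValueByDoc.length];
        for (int doc = 0; doc < fieldValueByDoc.length; doc++) {
            ordByDoc[doc] = Arrays.binarySearch(terms, fieldValueByDoc[doc]);
        }
    }

    public static void main(String[] args) {
        SortCacheSketch cache = new SortCacheSketch(new String[] {"b", "a", "b", "c"});
        System.out.println(cache.terms.length + " distinct terms, "
                + cache.ordByDoc.length + " per-document entries");
    }
}
```

In such a layout the term array grows with the number of distinct values in the field, while the integer array grows with the number of documents, so a field with many unique values is the expensive case.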

Re: Re: about TopFieldDocs

2009-01-05 Thread tom
AUTOMATIC REPLY LUX is closed until 5th January 2009

Re: about TopFieldDocs

2009-01-05 Thread tom
AUTOMATIC REPLY LUX is closed until 5th January 2009

about TopFieldDocs

2009-01-05 Thread 장용석
Hi. :) I have a simple question. I have two code samples. 1) TopDocCollector collector = new TopDocCollector(5 * hitsPerPage); QueryParser parser = new QueryParser(fieldName, analyzer); query = parser.parse("keyword"); searcher.search(query, collector); ScoreDoc[] hits = collec