Re: interpreting scores

2009-05-07 Thread Nate
Hi Karl, No, sometimes there will not be a matching MP3 for a note file. When this happens, the results I get are very poor. For example, if a song with a common song word like "love" in the name does not have a matching note file, then I get a handful of results that contain the word "love" but a

Re: I got the score "0.3044460713863373" for the cosine similarity of two documents with the same text content!!

2009-05-07 Thread Grant Ingersoll
What does the searcher.explain() method say? -Grant On May 6, 2009, at 2:18 AM, Kamal Najib wrote: Hi, thanks for the reply. See: http://lucene.apache.org/java/2_4_1/api/index.html you will find there the Similarity have created and run to get the similarity between the two Strings. I did the
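For comparison, here is a minimal sketch of a plain cosine similarity over raw term counts, written without Lucene. It is not the code from this thread, and the class and method names (`CosineCheck`, `termFreqs`, `cosine`) are hypothetical. The tokenization (lowercase, split on non-word characters) is also an assumption. With this definition, two identical texts score approximately 1.0. A Lucene search score is, as far as I understand it, built from tf, idf, length norms and query normalization rather than a normalized cosine, which is why the explain() output is the place to look.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, independent of Lucene: cosine similarity over raw term counts.
class CosineCheck {

    // Count how often each lowercase token occurs (assumed tokenization: split on non-word characters).
    static Map<String, Integer> termFreqs(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String tok : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!tok.isEmpty()) {
                tf.merge(tok, 1, Integer::sum);
            }
        }
        return tf;
    }

    // cos(a, b) = dot(a, b) / (|a| * |b|); returns 0.0 if either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            na += e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (int v : b.values()) {
            nb += v * v;
        }
        if (na == 0 || nb == 0) {
            return 0.0;
        }
        return dot / Math.sqrt(na * nb);
    }

    public static void main(String[] args) {
        String text = "the same text content";
        // Identical inputs give a value of about 1.0 under this definition.
        System.out.println(cosine(termFreqs(text), termFreqs(text)));
    }
}
```

If the number from Lucene differs from what this kind of calculation gives, that by itself does not indicate a bug; the two are computing different things.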

Re: get term neighbours

2009-05-07 Thread Grant Ingersoll
On May 7, 2009, at 9:11 AM, Adrian Dimulescu wrote: Thank you for these clarifications. As I had to do something fast, I coded the thing as illustrated by the following pseudocode: IndexReader index; TermPositions iterator = this.index.termPositions(t); // for each doc where this term appe

Re: Lucene Index Encryption

2009-05-07 Thread Peter_Lenahan
Michael, Thanks for the comments; they are very insightful. I hadn't thought about the Random Access issues until you brought it up. This makes the project a little tougher, but not impossible. I was searching last night and there have been a couple of papers written on the topic of Encrypted

Re: why setPhraseSlop() not helping

2009-05-07 Thread Erick Erickson
You haven't forced the double quotes through to the parser. Try: Query query = qp.parse("\"word1 word2\""); On Thu, May 7, 2009 at 11:14 AM, Seid Mohammed wrote: > I have set the slop for my search to be some terms away for inclusion. > Unfortunately, the result is the same independent of my setPh
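To make the quoting point concrete, here is a small sketch in plain Java with no Lucene calls. The helper name `asPhrase` is hypothetical. It shows that the Java literal "word1 word2" contains no quote characters at all, so the parser would receive two bare terms, while the escaped form carries the double quotes that mark a phrase.

```java
// Hypothetical helper: wrap user text in the double quotes a phrase query needs.
class PhraseQuoteCheck {

    // Escapes embedded double quotes with a backslash, then surrounds the text with double quotes.
    static String asPhrase(String words) {
        return "\"" + words.replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        String bare = "word1 word2";        // no quote characters reach the parser
        String phrase = asPhrase(bare);     // same characters as the literal "\"word1 word2\""
        System.out.println(bare);
        System.out.println(phrase);
    }
}
```

The second string is what would be handed to qp.parse(...) in the suggestion above; only then is there a phrase for the slop setting to apply to.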

Re: get term neighbours

2009-05-07 Thread Adrian Dimulescu
Thank you for these clarifications. As I had to do something fast, I coded the thing as illustrated by the following pseudocode: IndexReader index; TermPositions iterator = this.index.termPositions(t); // for each doc where this term appears while (iterator.next()) { int docNr = it

why setPhraseSlop() not helping

2009-05-07 Thread Seid Mohammed
I have set the slop for my search to be some terms away for inclusion. Unfortunately, the result is the same independent of my setPhraseSlop(int) usage. code excerpts: == QueryParser qp = new QueryParser("content", new AmharicAnalyzer()); qp.setPhraseSlop(3);

Re: interpreting scores

2009-05-07 Thread Karl Wettin
Nate, will there always be a corresponding mp3 for any given note sheet? As for analysis, I'd try using ngrams of the complete untokenized file name if I were you. "Michael Jackson Don't Stop 'till You Get Enough" -> "^mic", "mich", "icha", "chae", "hael", "ael ", "el j", "l ja", and so on
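The n-gram idea above can be sketched in plain Java as follows. This is only an illustration, not Karl's code: the class and method names are hypothetical, and lowercasing plus a single "^" start marker are assumptions read off the example grams. Whether an end marker is also wanted is not stated in the message, so none is added here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: character n-grams over the whole, untokenized file name.
class FileNameNGrams {

    // Lowercases the name, prefixes a "^" start marker, and slides a window of n characters.
    static List<String> ngrams(String name, int n) {
        String s = "^" + name.toLowerCase(Locale.ROOT);
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= s.length(); i++) {
            grams.add(s.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        // The first grams are "^mic", "mich", "icha", "chae", "hael", "ael ", "el j", "l ja".
        System.out.println(ngrams("Michael Jackson Don't Stop 'till You Get Enough", 4));
    }
}
```

Because spaces are kept inside the grams, word order and word boundaries in the file name contribute to the match, which is the point of not tokenizing first.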

Re: TermEnum with deleted documents

2009-05-07 Thread Michael McCandless
This is known & expected. Lucene does not update the terms dictionary (meaning which terms are in the index, and their frequency) in response to deleted docs. It does update TermDocs enumeration, i.e., once you get the TermDocs for a given term and step through its docs, the deleted docs will not be