Re: Help interpreting explanation

2006-03-06 Thread Chris Hostetter
: on using Lucene but info for the internal workings of Lucene is hard to : come by. As with many OS code bases: the code is the documentation. : 1) I'm using the default QueryParser to parse and return a query so it's : a Boolean-OR query. So does this mean it uses the DisjunctionSumScorer : or

Re: Help interpreting explanation

2006-03-06 Thread Eugene
Thanks, Chris, for your clear explanations. It seems there is a lot of info on using Lucene but info for the internal workings of Lucene is hard to come by. I've got some more questions which I'll ask in-line. Chris Hostetter wrote: : Since i'm using a boolean OR query i figured it must be related

Re: Help interpreting explanation

2006-03-06 Thread Chris Hostetter
: Since i'm using a boolean OR query i figured it must be related to the : BooleanScorer (though there's a more complicated BooleanScorer2 which : I'm not sure when it's used). There are actually three possible scorers used: ConjunctionScorer can be used if all of the clauses are required. Most of

Re: Help interpreting explanation

2006-03-06 Thread Eugene
Hi, Since i'm using a boolean OR query i figured it must be related to the BooleanScorer (though there's a more complicated BooleanScorer2 which I'm not sure when it's used). Looking at the BooleanScorer code it's probably a little over my head as I'm still a beginner to Lucene. But, I woul

Re: Help interpreting explanation

2006-03-05 Thread Chris Hostetter
: cosine similarity and need some help. Can anyone tell me in which file : the methods of DefaultSimilarity are called? Most of the Similarity methods are called by the various Scorers. A good IDE will tell you where they are called (or you could just grep the source, that's what I

Re: Help interpreting explanation

2006-03-05 Thread Eugene
Thanks for posting the "more like this" code. I just began coding my cosine similarity and need some help. Can anyone tell me in which file the methods of DefaultSimilarity are called? For example, looking at the tf method I see that it takes in a float for freq instead of an int.

Re: Help interpreting explanation

2006-03-05 Thread Eric Jain
Eugene wrote: Any good links on extending the similarity class? A lot of posts discuss David Spencer's "More Like This" but I can't find this anywhere. The "More Like This" code can be found here: http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/contrib/similarity/ --

Re: Help interpreting explanation

2006-03-05 Thread Eugene
I was wondering if anyone has any idea how I can start to implement my own similarity. I want to use the cosine similarity measure instead. I was looking through past forum posts and saw that quite a few people have also discussed this, but no real method of doing it was mentioned. Any good
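For reference, the cosine measure asked about here can be sketched outside of Lucene entirely. The class below is a hypothetical, standalone illustration (the name CosineSimilaritySketch and the term-frequency maps are assumptions for the example, not part of Lucene's Similarity API); it only shows the arithmetic: the dot product of two term-frequency vectors divided by the product of their lengths.

```java
import java.util.Map;

/** Hypothetical standalone sketch; not Lucene's Similarity API. */
final class CosineSimilaritySketch {

    /** Euclidean length of a term-frequency vector. */
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int freq : vector.values()) {
            sumOfSquares += (double) freq * freq;
        }
        return Math.sqrt(sumOfSquares);
    }

    /**
     * Cosine of the angle between two term-frequency vectors.
     * Returns 0.0 when either vector is empty or all zeros.
     */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            Integer other = b.get(entry.getKey());
            if (other != null) {
                // Only terms present in both vectors contribute to the dot product.
                dot += (double) entry.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }
}
```

Two vectors with no terms in common give 0.0, and a vector compared with itself gives a value of 1.0 up to floating-point rounding. How to map this onto Lucene's own scoring hooks is a separate question that this sketch does not answer.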

Re: Help interpreting explanation

2006-03-03 Thread Chris Hostetter
: I was looking at the new 1.9 api and can't seem to find this expert mode : of searching. Yonik's referring to all of the methods in the Searcher class that have "Expert" in their (javadoc) description. : http://lucene.apache.org/java/docs/api/org/apache/lucene/search/IndexSearcher.html#search(

Re: Help interpreting explanation

2006-03-03 Thread Eugene
I was looking at the new 1.9 api and can't seem to find this expert mode of searching. http://lucene.apache.org/java/docs/api/org/apache/lucene/search/IndexSearcher.html#search(org.apache.lucene.search.Weight,%20org.apache.lucene.search.Filter,%20org.apache.lucene.search.HitCollector) Can you te

Re: Help interpreting explanation

2006-03-03 Thread Yonik Seeley
On 3/3/06, Eugene <[EMAIL PROTECTED]> wrote: > Just one more question: Any way in which i can disable this normalization? We disabled this normalization in Lucene 1.9 for the "expert" level search methods on IndexSearcher. Use the search methods that don't return Hits. -Yonik --

Re: Help interpreting explanation

2006-03-03 Thread Eugene
OK, I figured out the normalization; it was actually in an earlier post here: http://mail-archives.apache.org/mod_mbox/lucene-java-user/200601.mbox/[EMAIL PROTECTED] Just one more question: Any way in which i can disable this normalization? Thanks for all the help so far. -- Eugene Eugene wro

Re: Help interpreting explanation

2006-03-03 Thread Eugene
Hi, You mentioned: "The Hits class normalizes scores by dividing all scores by the highest score, if that highest score is above 1.0." Can you explain what highest score we are talking about? I think there's only one score for a query and doc, right? Thanks Yonik Seeley wrote: On 3/3/06, Euge
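The normalization quoted above can be illustrated with a small standalone sketch. It assumes the "highest score" is the largest raw score among all documents matched by the one query (each document has a single score, but the result set has many). The class name HitsNormalizationSketch is hypothetical, and this is not the code of Lucene's Hits class, only the arithmetic described in the quote.

```java
import java.util.Arrays;

/** Hypothetical standalone sketch; does not use or reproduce Lucene's Hits class. */
final class HitsNormalizationSketch {

    /**
     * Returns a copy of rawScores in which every score is divided by the
     * highest raw score, but only when that highest score is above 1.0.
     */
    static float[] normalize(float[] rawScores) {
        float max = 0.0f;
        for (float score : rawScores) {
            max = Math.max(max, score);
        }
        float[] normalized = Arrays.copyOf(rawScores, rawScores.length);
        if (max > 1.0f) {
            for (int i = 0; i < normalized.length; i++) {
                normalized[i] = normalized[i] / max;
            }
        }
        return normalized;
    }

    public static void main(String[] args) {
        // One raw score per matching document for a single query.
        float[] raw = {2.0f, 1.0f, 0.5f};
        System.out.println(Arrays.toString(normalize(raw)));
    }
}
```

Under this reading, a result set whose top raw score is at or below 1.0 is left untouched, which would explain why the values in an Explanation sometimes match the reported score and sometimes do not.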

Re: Help interpreting explanation

2006-03-03 Thread Yonik Seeley
On 3/3/06, Eugene <[EMAIL PROTECTED]> wrote: > Hi Yonik, > > Thanks a lot, I think i understand how explanation works better now. > > But, there's something weird I noticed. I've a query like: > "problem formulation each possible x probability p x y find x p x y > maximized how compute p x y" > > T

Re: Help interpreting explanation

2006-03-03 Thread Eugene
Hi Yonik, Thanks a lot, I think i understand how explanation works better now. But, there's something weird I noticed. I've a query like: "problem formulation each possible x probability p x y find x p x y maximized how compute p x y" The weird thing is that literals like "problem", "formulat

Re: Help interpreting explanation

2006-03-02 Thread Yonik Seeley
On 3/2/06, Eugene Ezekiel <[EMAIL PROTECTED]> wrote: > Thanks Yonik for the reply. I got just a couple more questions, > > 1) Why does the explanation print so many times? Because it was a compound query with multiple parts to it. It's one explanation with multiple parts. From the explain out

Re: Help interpreting explanation

2006-03-02 Thread Eugene Ezekiel
Thanks Yonik for the reply. I got just a couple more questions, 1) Why does the explanation print so many times? 2) Since my query is made up of multiple terms how do I know what term "x" is referring to? On 3/3/06, Yonik Seeley <[EMAIL PROTECTED]> wrote: > > I think Lucene in Action does a

Re: Help interpreting explanation

2006-03-02 Thread Yonik Seeley
I think Lucene in Action does a good job of it. There is also a formula given in the javadoc for DefaultSimilarity http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html See my comments below (inline) On 3/2/06, Eugene <[EMAIL PROTECTED]> wrote: > Hi All, > > I'm not sure
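As a reading aid for the formula mentioned above, here is a standalone sketch of the individual factors that usually show up in an Explanation. The formulas are written from memory of the 1.9-era DefaultSimilarity and should be checked against the linked javadoc; the class name SimilarityFactorsSketch is hypothetical and nothing here calls into Lucene.

```java
/**
 * Hypothetical standalone sketch of the scoring factors, written from memory
 * of the 1.9-era DefaultSimilarity. Verify against the javadoc before relying on it.
 */
final class SimilarityFactorsSketch {

    /** Term frequency factor: square root of the raw frequency. */
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    /** Inverse document frequency: log(numDocs / (docFreq + 1)) + 1. */
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    /** Field length normalization: 1 / sqrt(number of terms in the field). */
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    /** Coordination factor: fraction of the query terms that matched. */
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    /** Query normalization: 1 / sqrt(sum of squared weights). */
    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }
}
```

For instance, tf(4.0f) gives 2.0 and lengthNorm(4) gives 0.5, which can be compared against the tf and fieldNorm lines of an Explanation (the stored field norm is lossy, so small differences are expected).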

Help interpreting explanation

2006-03-02 Thread Eugene
Hi All, I'm not sure how to interpret the result of the toString method of Explanation. I'm trying to see the values of each component of the Default Similarity formula for a particular query and a doc. Given below is a sample of my Explanation output. Many thanks if anyone could help expla