Re: custom similarity based on tf but greater than 1.0

2007-01-23 Thread Vagelis Kotsonis
So the normalization was done by Hits. That was the part I didn't understand. On my own I would have gone looking in the Scorer and Query classes. Thank you for this. In the end I used the following: final HitQueue hq = new HitQueue(results.length()); searcher.search(qr, new HitCollector

Re: custom similarity based on tf but greater than 1.0

2007-01-23 Thread Vagelis Kotsonis
> To: java-user@lucene.apache.org > Sent: Thursday, January 18, 2007 5:36:21 PM > Subject: Re: custom similarity based on tf but greater than 1.0 > > I just did the same thing. If you search the list you'll find the thread > where Hoss gave me the info you n

Re: custom similarity based on tf but greater than 1.0

2007-01-19 Thread Otis Gospodnetic
To: java-user@lucene.apache.org Sent: Thursday, January 18, 2007 5:36:21 PM Subject: Re: custom similarity based on tf but greater than 1.0 I just did the same thing. If you search the list you'll find the thread where Hoss gave me the info you need. It really comes down to making a FakeNormsIndexRea

Re: custom similarity based on tf but greater than 1.0

2007-01-18 Thread Vagelis Kotsonis
It is 4 in the morning here in Greece, so I will try it tomorrow...at some point I must sleep! I will report the results tomorrow. Thanks! Vagelis markrmiller wrote: > > Ah...I brushed over your example too fast...looked like normal > counting to me...I see now what you mean. So OMIT_NORM

Re: custom similarity based on tf but greater than 1.0

2007-01-18 Thread Mark Miller
Ah...I brushed over your example too fast...looked like normal counting to me...I see now what you mean. So OMIT_NORMS probably did work. Are you getting the results through Hits? Hits will normalize the scores. Use TopDocs or a HitCollector instead. - Mark Vagelis Kotsonis wrote: But I don't want to get th
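To illustrate the normalization Mark mentions: as far as I recall, Hits in the Lucene 2.x line scales every score by 1/maxScore whenever the top raw score is above 1.0, which is why scores read through Hits never exceed 1.0. The sketch below is plain Java with no Lucene dependency; the class and method names are hypothetical and only mimic that behavior as an assumption, they are not Lucene code.

```java
// Hypothetical stand-in for the score scaling attributed to Hits (assumption, not Lucene source).
final class HitsNormalizationSketch {

    // If the best raw score is above 1.0, scale all scores so the best becomes 1.0.
    // Otherwise the scores are returned unchanged.
    static float[] normalize(float[] rawScores) {
        float max = 0.0f;
        for (float s : rawScores) {
            max = Math.max(max, s);
        }
        float scoreNorm = (max > 1.0f) ? 1.0f / max : 1.0f;

        float[] scaled = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            scaled[i] = rawScores[i] * scoreNorm;
        }
        return scaled;
    }

    public static void main(String[] args) {
        // Raw scores such as a count of matching terms.
        float[] scaled = normalize(new float[] {3.0f, 2.0f, 1.0f});
        for (float s : scaled) {
            System.out.println(s);
        }
    }
}
```

Reading raw scores from TopDocs or inside a HitCollector, as suggested above, skips this scaling step, so a score like 3.0 stays 3.0.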

Re: custom similarity based on tf but greater than 1.0

2007-01-18 Thread Vagelis Kotsonis
But I don't want to get the frequency of each term in the doc. What I want is 1 if the term exists in the doc and 0 if it doesn't. After this, I want all these 1s and 0s to be summed and give me a number to use as a score. If I set the TF value as 1 or 0, as I described above, I get the right num
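The score described here (1 per query term present in the document, 0 otherwise, summed) can be sketched in plain Java without Lucene. This is only an illustration of the arithmetic being asked for; the class name, method name, and sample terms are hypothetical, and it says nothing about how Lucene computes it internally.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the desired score: number of distinct query terms found in the doc.
final class TermPresenceScore {

    // Each distinct query term contributes 1 if the document contains it, 0 otherwise.
    static int score(Set<String> docTerms, Collection<String> queryTerms) {
        int sum = 0;
        for (String term : new HashSet<>(queryTerms)) {
            if (docTerms.contains(term)) {
                sum += 1;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Set<String> doc = new HashSet<>(Arrays.asList("lucene", "custom", "similarity"));
        System.out.println(score(doc, Arrays.asList("custom", "similarity", "tf")));
    }
}
```

A score of this kind is naturally greater than 1.0 for documents matching several terms, which is the situation in the thread subject.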

Re: custom similarity based on tf but greater than 1.0

2007-01-18 Thread Mark Miller
Don't return 1 for tf...just return the tf straight with no changes...return freq. For everything else return 1. After that OMIT_NORMS should work. If you want to try a custom reader: public class FakeNormsIndexReader extends FilterIndexReader { byte[] ones = SegmentReader.createFakeNorms(max

Re: custom similarity based on tf but greater than 1.0

2007-01-18 Thread Vagelis Kotsonis
I feel kind of stupid...I don't get what hossman says in his post. I got the part about OMIT_NORMS and I tried to do it by calling Field.setOmitNorms(true); before adding a field to the index. After that I re-indexed my collection, but it still makes no difference. Tell me if I got it rig

Re: custom similarity based on tf but greater than 1.0

2007-01-18 Thread Mark Miller
Sorry you're having trouble finding it! Allow me...bingo: http://www.gossamer-threads.com/lists/lucene/java-user/43251?search_string=sorting%20by%20per%20doc%20hit;#43251 It probably doesn't have great keywords for finding it. That should get you going though. Let me know if you have any questions. - Mark

Re: custom similarity based on tf but greater than 1.0

2007-01-18 Thread Vagelis Kotsonis
Before asking this question I searched the list for over 2 hours and didn't find anything that helped me understand how to do what I want. After you sent the message I made a quick pass through all your messages, but I didn't find anything. I also searched for FakeNormsIndexReader and s

Re: custom similarity based on tf but greater than 1.0

2007-01-18 Thread Mark Miller
I just did the same thing. If you search the list you'll find the thread where Hoss gave me the info you need. It really comes down to making a FakeNormsIndexReader. The problem you are having is a result of the field size normalization. - Mark Vagelis Kotsonis wrote: Hi all. I am trying to