Can you share your query generation code? Your description doesn't make sense to me and I wonder how you are creating and running the searches.

Can you run the explain() method on your documents?

Also, FWIW, it sounds like you are prematurely optimizing. For every query you ever do in your entire life (or until search engines are capable of reading your mind, and even then, I'm not sure) on any search system, you will be able to find at least one result that you think is better than one that is ranked higher. Now, if you told me you had 50 queries that all scored badly, that is different.


On Aug 28, 2008, at 1:37 PM, Shi Hui Liu wrote:

Hi Grant,

Thank you for your help. My query is A AND B. The problem is that with BooleanQuery, the first article gets a score of 120 from TermQuery(A) and 0.5 from TermQuery(B), while the second article gets 27 from TermQuery(A) and 36 from TermQuery(B). From my point of view the second article is better than the first, but the ranking says otherwise. I'm thinking of creating a Query that multiplies the scores of A and B instead of summing them. But I'd like to know whether an existing query already does this, since I'm still new to Lucene.

Thank you in advance,
Shi Hui
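
To make the numbers above concrete, here is a minimal plain-Java sketch (no Lucene classes involved) that compares the two ways of combining the per-term scores quoted in this message. The class and method names are hypothetical, and it assumes the combined score is simply the sum of the two clause scores, ignoring coord and query normalization. Under that assumption the sums are 120.5 vs. 63, so the first article ranks higher, while the products are 60 vs. 972, which would put the second article first.

```java
// Hypothetical illustration only: not a Lucene Query implementation.
public class ScoreCombination {

    // Additive combination, assumed here to approximate how the
    // BooleanQuery result described in the message was produced.
    public static double sum(double scoreA, double scoreB) {
        return scoreA + scoreB;
    }

    // Multiplicative combination, the alternative proposed in the message.
    // It rewards documents where both terms score reasonably well.
    public static double product(double scoreA, double scoreB) {
        return scoreA * scoreB;
    }

    public static void main(String[] args) {
        // Per-term scores quoted in the message: {TermQuery(A), TermQuery(B)}.
        double[][] articles = { {120.0, 0.5}, {27.0, 36.0} };

        for (int i = 0; i < articles.length; i++) {
            double a = articles[i][0];
            double b = articles[i][1];
            System.out.println("article " + (i + 1)
                    + ": sum=" + sum(a, b)
                    + " product=" + product(a, b));
        }
    }
}
```

Whether a product is actually the right ranking function is a separate question; the explain() output for both documents would show where the 120 and the 0.5 come from in the first place.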

-----Original Message-----
From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 28, 2008 5:57 AM
To: java-user@lucene.apache.org
Subject: Re: Clarity: Is there a Query boosting 50-50 over 1000-1 ?


On Aug 27, 2008, at 7:34 PM, Shi Hui Liu wrote:

Hi,

I think I should clarify my question a little bit. I'm using
BooleanQuery to combine TermQuery(A) and TermQuery(B). But I'm not
satisfied with its scoring algorithm. Are there other queries that can
boost documents with 50 of A and 50 of B above documents
with 1000 of A and 1 of B?

Is your query A + B meant to be A OR B or A AND B?  That is, are both
terms required?  Your notation suggests they are, but the description
suggests you are getting documents that have only A in them, which
suggests "OR".

Have you looked at the explains?  What about the scoring aren't you
happy with?  It's not perfect (there is no such thing) but it works
pretty well in most cases, and works great if you spend a little time
figuring out the right length normalization factors.


And I'm looking at the source code and found that lots of classes are not
public and some important methods are protected. What's the reason?
Why not make them public and let users customize the Query easily?

Because they're not meant to be overridden, but of course we are open to
specific suggestions on things that should be made public and often do
this when someone shows a valid reason.

Cheers,
Grant

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
