On Sep 24, 2008, at 5:04 AM, Dino Korah wrote:

Hi all,

Could you please help me understand how this works?

If I boost documents at index time based on one criterion and then sort on a different criterion at query time, how will the results be affected by the boosting?


Say I index a bunch of text files in a folder structure and boost individual documents based on the depth of their path. If I then search on the file content and sort on the modification time, what will the results look like?

Index-time boosting is used to say Document A is more important than Document B. Thus, all else being equal between the two, Doc A will score higher. Sorting on other values has no bearing on the scoring. Thus, if you sort by mod time, you may see Doc B higher in the list because of its mod time, but its score would still be lower than A's.
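
To make the separation between scoring and sorting concrete, here is a minimal sketch in plain Java. It does not use the Lucene API: the Hit class, the helper methods, and the score values are all hypothetical, chosen only so that the shallower file has the higher score. The paths and mod times come from two of the files in the example in this thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SortVersusScoreSketch {
    // Hypothetical stand-in for a search hit: a path, a relevance score, a mod time.
    static final class Hit {
        final String path;
        final float score;   // made-up score, assumed to already reflect the index-time boost
        final long modTime;  // yyyyMMddHHmmss stored as a number

        Hit(String path, float score, long modTime) {
            this.path = path;
            this.score = score;
            this.modTime = modTime;
        }
    }

    static List<Hit> sampleHits() {
        List<Hit> hits = new ArrayList<>();
        // Shallow file: larger boost, so (all else equal) a higher score.
        hits.add(new Hit("Folder1/Folder2/text_file2.txt", 0.5f, 20080102000000L));
        // Deep file: smaller boost, lower score, but modified more recently.
        hits.add(new Hit("Folder1/Folder2/Folder3/Folder4/Folder5/text_file4.txt", 0.2f, 20080105000000L));
        return hits;
    }

    // Order by relevance: highest score first.
    static List<Hit> sortByScoreDescending(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return sorted;
    }

    // Order by mod time, newest first. The scores are carried along unchanged.
    static List<Hit> sortByModTimeDescending(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingLong((Hit h) -> h.modTime).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        for (Hit h : sortByModTimeDescending(sampleHits())) {
            System.out.println(h.modTime + "  score=" + h.score + "  " + h.path);
        }
    }
}
```

Sorting by mod time puts text_file4.txt first even though its score is lower; the boost still affects the score, it just no longer decides the order.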

One side note based on your example below: index-time boosting does not have much granularity (only 255 values); in other words, there is a loss of precision. Thus, you want to make sure your boosts are different enough that you can distinguish between them. Maybe 1/(2*depth) or something like that. You can alter how these 255 values are encoded, but that is fairly advanced stuff.
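
The loss of precision can be seen with a rough sketch of a one-byte float encoding. The code below is plain Java and assumes a layout of 3 mantissa bits and 5 exponent bits, which is my understanding of how the default encoding works; treat the exact bit layout, the class name, and the method names as assumptions rather than the library's actual code. Note also that in a real index the boost is combined with other factors before it is encoded, so this only shows the rounding of the boost value on its own.

```java
class BoostPrecisionSketch {
    // Offset that maps the supported exponent range onto byte values (assumed layout).
    private static final int EXPONENT_ZERO = (63 - 15) << 3;

    // Squeeze a float into one byte: keep the top 3 mantissa bits and 5 exponent bits.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);           // drop the low 21 mantissa bits
        if (small <= EXPONENT_ZERO) {
            // Zero or negative maps to 0; tiny positive values map to the smallest code.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= EXPONENT_ZERO + 0x100) {
            return (byte) -1;                   // too large: clamp to the biggest code
        }
        return (byte) (small - EXPONENT_ZERO);
    }

    // Expand the byte back into a float; the dropped mantissa bits come back as zeros.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        // Print what each 1/depth boost looks like after a round trip through one byte.
        for (int depth = 1; depth <= 12; depth++) {
            float boost = 1.0f / depth;
            byte code = encode(boost);
            System.out.println("depth=" + depth + " boost=" + boost
                    + " code=" + (code & 0xff) + " decoded=" + decode(code));
        }
    }
}
```

Under this assumed encoding, 1/4 and 1/5 still land on different byte values, but 1/11 and 1/12 collapse to the same one, so those two depths would no longer be distinguishable by boost alone.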

HTH,
Grant




Eg: (boost is 1/depth)
Folder1/Folder2/Folder3/Folder4/Folder5/text_file1.txt, mod-time: 20080101000000, depth-factor: 1/5
Folder1/Folder2/text_file2.txt, mod-time: 20080102000000, depth-factor: 1/2
Folder1/Folder2/Folder3/Folder4/text_file3.txt, mod-time: 20080101000000, depth-factor: 1/4
Folder1/Folder2/Folder3/Folder4/Folder5/text_file4.txt, mod-time: 20080105000000, depth-factor: 1/5

Many thanks.
Dino




--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ







