On Sep 24, 2008, at 5:04 AM, Dino Korah wrote:
> Hi all,
> Could you please help me understand how that works?
> If I boost documents at index time based on one criterion and then
> sort on a different criterion at query time, how will the results be
> affected by the boosting?
> Say I index a bunch of text files in a folder structure and boost
> individual documents based on the depth of the path. If I then search
> on the file content and sort on the modification time, what will the
> results look like?

Index-time boosting is a way of saying that Document A is more
important than Document B. Thus, all else being equal between the two,
Doc A will score higher. Sorting on other values has no bearing on the
scoring. Thus, if you sort by mod time, you may see Doc B higher in the
list because of its mod time, but its score would still be lower than
A's.
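
To make the distinction concrete, here is a minimal sketch in plain Python (not Lucene code). It assumes, purely for illustration, that all four files from the example match the query equally well, so each hypothetical score is just the 1/depth boost. The Hit class and the score values are made up for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    name: str        # file name from the example (folder path omitted)
    mod_time: str    # yyyymmddHHMMSS, so string order equals time order
    score: float     # hypothetical score: equal content match * 1/depth boost


hits = [
    Hit("text_file1.txt", "20080101000000", 1 / 5),
    Hit("text_file2.txt", "20080102000000", 1 / 2),
    Hit("text_file3.txt", "20080101000000", 1 / 4),
    Hit("text_file4.txt", "20080105000000", 1 / 5),
]

# Default relevance ordering: highest score first.
by_score = sorted(hits, key=lambda h: h.score, reverse=True)

# Sorting on modification time (newest first) ignores the score entirely.
by_mod_time = sorted(hits, key=lambda h: h.mod_time, reverse=True)

for label, ordering in (("by score", by_score), ("by mod time", by_mod_time)):
    print(label)
    for h in ordering:
        print(f"  {h.name}  mod_time={h.mod_time}  score={h.score:.2f}")
```

The boost changes each document's score, and the sort only changes the order in which the hits are listed. text_file4.txt comes first when sorting by mod time even though its score is the lowest.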
One side note based on your example, below: index-time boosting does
not have much granularity. The value is stored in a single byte, so
there are at most 256 distinct values and precision is lost. Thus, you
want to make sure your boosts are different enough that you can still
tell them apart after encoding. Maybe 1/(2*depth) or something like
that. You can alter how these values are encoded, but that is fairly
advanced stuff.
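
A rough way to see the loss of precision: the sketch below is a Python port, written from memory, of the one-byte float encoding that I believe Lucene's SmallFloat.floatToByte315 implements. Treat the exact bit layout as an assumption and check it against your Lucene version. It also looks at the boost in isolation, whereas the stored value normally has length normalization folded in as well.

```python
import struct


def float_to_byte315(f: float) -> int:
    """Squeeze a float into one byte (0..255) by truncating its float32 bits."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    small = bits >> (24 - 3)
    zero_point = (63 - 15) << 3
    if small <= zero_point:
        # Zero and negatives map to 0, tiny positives to the smallest code.
        return 0 if bits <= 0 else 1
    if small >= zero_point + 0x100:
        return 255  # overflow: clamp to the largest code
    return small - zero_point


def byte315_to_float(b: int) -> float:
    """Expand a one-byte code back into the float it stands for."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << (24 - 3)) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


# Boost = 1/depth, as in the example; print what survives the round trip.
for depth in range(1, 11):
    boost = 1.0 / depth
    code = float_to_byte315(boost)
    print(f"depth={depth:2d}  boost={boost:.4f}  byte={code:3d}  "
          f"decoded={byte315_to_float(code):.5f}")
```

Under this encoding, a boost of 1/5 comes back as 0.1875, and depths 7 and 8 end up with the same stored byte, so those two boosts can no longer be told apart.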
HTH,
Grant
> Eg: (boost is 1/depth)
> Folder1/Folder2/Folder3/Folder4/Folder5/text_file1.txt,
>   mod-time: 20080101000000, depth-factor: 1/5
> Folder1/Folder2/text_file2.txt,
>   mod-time: 20080102000000, depth-factor: 1/2
> Folder1/Folder2/Folder3/Folder4/text_file3.txt,
>   mod-time: 20080101000000, depth-factor: 1/4
> Folder1/Folder2/Folder3/Folder4/Folder5/text_file4.txt,
>   mod-time: 20080105000000, depth-factor: 1/5
>
> Many thanks.
> Dino

--------------------------
Grant Ingersoll
http://www.lucidimagination.com
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ