>>So I'm afraid I can't use the technique you recommend.

Ah, right - so the TermVector you read from the index will return both mixed-case and lower-case versions of the same text. One point to note: of the 25 or so top terms that MoreLikeThis selects for querying, there is a reasonable chance that some will be mixed-case and lower-case versions of the same term. Any match on a mixed-case term will necessarily also match its lower-cased version, so these duplicates are redundant query terms. They slow down your searches and limit the variety of terms that make the cut into the top 25.
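To make the effect concrete, here is a rough sketch in plain Python, not Lucene code. It assumes the term vector can be read as a simple term-to-frequency map, uses raw frequency as a stand-in for whatever scoring MoreLikeThis really applies, and keeps the larger count when merging case variants. The names `top_terms` and `fold_case` are made up for illustration.

```python
def top_terms(term_freqs, limit=25):
    """Return the `limit` highest-frequency terms, ties broken alphabetically."""
    ranked = sorted(term_freqs.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:limit]]


def fold_case(term_freqs):
    """Merge case variants of a term under its lower-cased form.

    Assumption: the lower-cased entry already covers the mixed-case hits,
    so the larger count is kept rather than the sum.
    """
    merged = {}
    for term, freq in term_freqs.items():
        key = term.lower()
        merged[key] = max(merged.get(key, 0), freq)
    return merged


# Hypothetical term vector holding both case variants of some terms.
freqs = {"Lucene": 5, "lucene": 5, "Index": 4, "index": 4, "query": 3, "term": 2}

print(top_terms(freqs, limit=3))             # case variants use up two of three slots
print(top_terms(fold_case(freqs), limit=3))  # three distinct terms
```

With a limit of 3 the unfolded list spends two slots on one word, while the folded list has room for three different terms. The same thing happens at 25.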

Cheers
Mark

