On Fri, Jun 02, 2006 at 03:47:10PM -0700, Chris Hostetter wrote:
> You may want to check out the java-dev list ... there's been some talk
> among the people who really understand the low levels of lucene's file
> formats about adding arbitrary "payload" data with each term/doc pair .. a
> proposal that started (as far as i can tell) from a desire to have
> individual term/doc boosting...

It's funny that you don't seem to include yourself in that group yet,
Hoss. I imagine it won't be long.

If you haven't read that Brin/Page paper from 1998 yet, you should check
it out. Enabling individual positions to be boosted is indeed one of the
main targets of the current discussion.

A slightly easier-to-understand application would be boosting individual
tokens according to relative font size. For instance, we might assume
that text between <h1> tags is more important than text between <p> tags
and boost it accordingly. There's no good way to handle that right now.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/