On Friday 28 April 2006 23:09, Daniel Shane wrote: > Hi! > > [I'm sorry for also posting this on the dev mailing list, but I was not sure > in which one it would be best, so if there is a moderator, please kill > either one.] > > I'm planning on contributing to Lucene by adding a new kind of query. I dont > know how to call it yet, but it would be a mix of BooleanQuery and > ExactPhraseQuery. > > I would like to have a Query that is a BooleanQuery, but with a slight touch > where it would boost results if it finds the query terms in a an exact > phrase. > > For example, if I have terms A, B and C and I do a simple boolean search : A > B C, I would like to have a query that behaves a bit like if I rewrote this > query as such : > > +A +B +C "A B" "B C" "A B C" > > This would boost results where the exact string "A B C" or any substring > like "A B" or "B C" are found. > > Of course I could rewrite all the queries, but it takes way too long to > search which this algorithm. I wanted to know if anyone has any ideas in > what direction I should go, or if its easy or not to implement this idea by > modifying or extending some already existing Query classes. > > I'm fairly new to Lucene although I know a bit about search engines, idf, > etc... but I've tried to understand BooleanQuery and ExactPhraseQuery to see > how I could modify them and I'm having a bit of a problem understanding it > all on my own I guess. > > Any help or comments would be appreciated, and if it works well I do think > it would be a good addition to the Lucene code base (I think this query > should be used as a default in the QueryParser if it works ok instead of a > simple BooleanQuery).
Have a look in the archives at the thread "Lucene performance bottlenecks" that started on 2 Dec 2005. There are some recommendations there on how to implement a Scorer for such a query. Regards, Paul Elschot --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]