On Wednesday 06 April 2005 08:19, Chuck Williams wrote: > Erik Hatcher writes (4/5/2005 5:57 PM): > > > I have a need to implement wildcarded phrase queries, such as this: > > > > "apach? luc*" > > > > which would match "apache lucene", for example. This needs to also > > support ordered and unordered proximity like SpanNearQuery does: > > > > "apach? luc*"~10 > > > > I presume I'm going to have to key off of SpanQuery with a some > > specialized subclasses. > > > > What approach do you recommend for implementing something like this? > > Hi Erik, > > Might it be as easy as creating a SpanWilcardQuery that transforms into > a SpanOrQuery of SpanTermQuery's, and then use a SpanNearQuery of > SpanWildcardQuery's? You could use a WildcardTermEnum.to generate the > list of terms for the SpanOrQuery. This would have some issues like > computing the idf as the sum of all the pattern-matched terms, but it > looks like that issue still exists with WildcardQuery too. I haven't > done much with SpanQuery's so this might not work out so simply, or be > acceptably efficient.
I did something very much like this in the surround query language. I used a SpanNearClauseFactory for use during rewrite of a truncated term with the following operations: - create a SpanNearClauseFactory from a field name and an indexreader. - add a weighted Term. This internally adds a corresponding SpanTermQuery, or increases the weight of an already added one. - add a weighted SpanNearQuery as a subquery (for nesting, but that probably doesn't fit your syntax above). - create a clause from the things added above. This normally produces a SpanOrQuery over the added terms and subqueries. Like the boolean clauses, it's advisable to set a maximum to the total number of SpanNearTerms used for a query. Regards, Paul Elschot. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]