In Lucene 3.4, I recently implemented "Translating PhraseQuery to SpanNearQuery" (see Lucene in Action, page 220) because I wanted _order_ to matter.
Here is my exact code, called from getFieldsQuery once I know I'm looking at a PhraseQuery. I think it is taken exactly from the book.

    static Query buildSpanNearQuery(PhraseQuery phraseQ, int slop) {
        // Only the terms are copied; any positions recorded in the PhraseQuery are not used.
        Term[] terms = phraseQ.getTerms();
        SpanTermQuery[] clauses = new SpanTermQuery[terms.length];
        for (int i = 0; i < terms.length; i++) {
            clauses[i] = new SpanTermQuery(terms[i]);
        }
        // PHRASE_ORDER_MATTERS is a boolean constant defined elsewhere (the inOrder flag).
        SpanNearQuery query = new SpanNearQuery(clauses, slop, PHRASE_ORDER_MATTERS);
        return query;
    }

I put this in my own QueryParser, and things looked good until I tried a phrase with stop words. Using the old PhraseQuery, I got results on a phrase with stop words without extending the slop. With SpanNearQuery, nothing is found unless the query includes some slop.

This conflicts with the typical use case of a user taking a phrase, pasting it into the search bar with quotes, and expecting to find his document. I can't just add a fixed amount of extra slop, because the amount needed depends on how many stop words occur in each sequence within the phrase.

Any suggestions on how to combine the idea of SpanNear (so that words appearing in order in a phrase score better) with text that has had stop words removed, so that I can support the simple use of quotes for exact quoted-text matching?

Any ideas?

-Paul
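
To illustrate why a fixed extra slop cannot work, here is a minimal plain-Java sketch (no Lucene classes) of the position arithmetic. It assumes the analyzer preserves position increments, so each removed stop word leaves a hole in the position sequence, and that an in-order span match needs a slop equal to the total number of holes between the surviving terms. The class name, method names, and stop list are hypothetical and only for illustration. If the parsed PhraseQuery carries those positions (PhraseQuery.getPositions() may expose them), a per-query slop could possibly be derived the same way, but that is an assumption I have not verified against 3.4.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class StopWordGapDemo {

    // Hypothetical stop list for illustration; not Lucene's actual default set.
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "and", "of", "the"));

    // Positions of the tokens that survive stop-word removal, assuming
    // removed words still consume a position (position increments preserved).
    static List<Integer> survivingPositions(String phrase) {
        List<Integer> positions = new ArrayList<Integer>();
        String[] tokens = phrase.toLowerCase().trim().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            if (!STOP_WORDS.contains(tokens[pos])) {
                positions.add(pos);
            }
        }
        return positions;
    }

    // Total number of empty positions between consecutive surviving terms.
    // Under the assumptions above, this is the slop an in-order match needs.
    static int requiredSlop(List<Integer> positions) {
        int slop = 0;
        for (int i = 1; i < positions.size(); i++) {
            slop += positions.get(i) - positions.get(i - 1) - 1;
        }
        return slop;
    }

    public static void main(String[] args) {
        String[] phrases = {
            "quick brown fox",
            "king of the hill",
            "war and peace and the sword"
        };
        for (String p : phrases) {
            List<Integer> pos = survivingPositions(p);
            System.out.println(p + " -> positions " + pos
                    + ", slop needed " + requiredSlop(pos));
        }
    }
}
```

With this stop list, the three sample phrases need a slop of 0, 2, and 3 respectively, so no single constant covers all quoted phrases.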