It seems to me that to constrain the search to a sentence this way, you'd have to override 'getPositionIncrementGap', which would then break phrase searches across the field values (sentences).
Peter On Wed, Jul 20, 2011 at 11:33 AM, <dar...@ontrenet.com> wrote: > > I just parse the text into sentences and put those in a multi-valued field > and then search that. > > On Wed, 20 Jul 2011 11:27:38 -0400, Peter Keegan <peterlkee...@gmail.com> > wrote: > > I have browsed many suggestions on how to implement 'search within a > > sentence', but all seem to have drawbacks. For example, from > > > > http://lucene.472066.n3.nabble.com/Issue-with-sentence-specific-search-td1644352.html#a1645072 > > > > Steve Rowe writes: > > > > ---------- > > One common technique, instead of using a larger-than-normal position > > increment gap between sentences, is using a sentence boundary token like > > '$' > > or something else that won't ever itself be the target of search. > Quoting > > from a post Mark Miller made to the lucene-user list last year < > > > > http://www.lucidimagination.com/search/document/c9641cbb1a3bf928/multiline_regex_with_lucene > >>): > > > > First you inject special marker tokens as your paragraph/ > > sentence markers, then you use a SpanNotQuery that looks > > for a SpanNearQuery that doesn't intersect with a > > SpanTermQuery containing the special marker term. > > > > Mark's suggestion would work for your within-sentence case, and for the > > case > > where you don't care about sentence boundaries, you can use > SpanNearQuery > > without the SpanNotQuery. > > ---------- > > > > The problem with the last part is that the SpanNearQuery would have to > have > > a slop of 1 in order to accomodate the marker token between sentences. > This > > could result in incorrect matches if the a slop of 0 is intended. > Another > > suggestion was to overlap the marker token with the first or last token > of > > the sentence, but the SpanNotQuery would always exclude any terms in the > > query that are at the intersection. Mark Miller's 'SpanWithinQuery' > patch > > seems to have the same issue. > > > > Has anyone implemented a solution that works for both in-sentence and > > across > > sentence boundaries? > > > > Thanks, > > Peter > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >