After applying the patch I was able to get the span positions for all the terms in the query. But now when I tried to access the positionSpans of each span term I cannot because they are stored in a package-private PositionSpan class in WeightedSpanTerm.java which prevents them from being visible outside the package. I was able to work around it in a by making PositionSpan utility class public. I moved PositionSpan class it into its own file in o.a.l.search.highlight package. I don't think this is the best solution, am open to other alternatives.
This is the content of o.a.l.search.highlight.PositionSpan.java // Utility class to store a Span public class PositionSpan { int start; int end; public PositionSpan(int start, int end) { this.start = start; this.end = end; } public int getEndPositionSpan() { return end; } public int getStartPositionSpan() { return start; } } -Jahangir On Fri, Jul 8, 2011 at 2:47 AM, Mark Miller <markrmil...@gmail.com> wrote: > > On Jul 7, 2011, at 5:14 PM, Jahangir Anwari wrote: > > > I did noticed a strange issue though. When the query is just a > > PhraseQuery(e.g. "everlasting glory"), getWeightedSpanTerms() returns all > > the span terms along with their span positions. But when the query is a > > BooleanQuery containing phrase and non-phrase terms(e.g. "everlasting > > glory"+unity), getWeightedSpanTerms() returns all the span terms but the > > span positions are returned only for the phrase terms(i.e. "everlasting" > and > > "glory"). Span positions for the non-phrase term(i.e. "unity") is empty. > Any > > ideas why this could be happening? > > > Positions are only collected for "position sensitive" queries. The > Highlighter framework that I plugged this into already runs through the > TokenStream one token at a time - to highlight a TermQuery, there is no need > to consult positions - just highlight every occurrence seen while marching > through the TokenStream. Which means there is no need to find those > positions either. > > If you are looking for those positions, here is a patch to calculate them > for TermQuerys as well. If you open a JIRA issue, seems like a reasonable > option to add to the class. > > Index: > lucene/contrib/highlighter/src/java/org/apache/lucene/search/highlight/WeightedSpanTermExtractor.java > =================================================================== > --- > lucene/contrib/highlighter/src/java/org/apache/lucene/search/highlight/WeightedSpanTermExtractor.java > (revision 1143407) > +++ > lucene/contrib/highlighter/src/java/org/apache/lucene/search/highlight/WeightedSpanTermExtractor.java > (working copy) > @@ -133,7 +133,7 @@ > sp.setBoost(query.getBoost()); > extractWeightedSpanTerms(terms, sp); > } else if (query instanceof TermQuery) { > - extractWeightedTerms(terms, query); > + extractWeightedSpanTerms(terms, new > SpanTermQuery(((TermQuery)query).getTerm())); > } else if (query instanceof SpanQuery) { > extractWeightedSpanTerms(terms, (SpanQuery) query); > } else if (query instanceof FilteredQuery) { > > > - Mark Miller > lucidimagination.com > > > > > > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >