[
https://issues.apache.org/jira/browse/LUCENE-7284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alan Woodward updated LUCENE-7284:
----------------------------------
Priority: Minor (was: Blocker)
Fix Version/s: 6.1
> UnsupportedOperationException wrt SpanNearQuery with Gap (Needed for Synonym
> Query Expansion)
> ---------------------------------------------------------------------------------------------
>
> Key: LUCENE-7284
> URL: https://issues.apache.org/jira/browse/LUCENE-7284
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/search
> Reporter: Daniel Bigham
> Assignee: Alan Woodward
> Priority: Minor
> Fix For: 6.1
>
> Attachments: LUCENE-7284.patch
>
>
> I am trying to support synonyms on the query side through query
> expansion.
> For example, the query "open webpage" can be expanded if the following
> terms are synonyms:
> "open" | "go to"
> With both the stop word filter and the stemming filter enabled, the
> expanded query becomes:
> {code}
> spanNear(
>   [
>     spanOr([Title:open, Title:go]),
>     Title:webpag
>   ],
>   0,
>   true
> )
> {code}
> Notice that "go to" became just "go", because the stop word filter
> apparently removes "to".
> Interestingly, if you turn "go to webpage" into a phrase, you get "go ?
> webpage", but if you turn "go to" into a phrase, you just get "go",
> because a trailing stop word in a PhraseQuery apparently gets dropped.
> (A PhraseQuery currently has no way to represent that gap: it encodes
> gaps implicitly through the positions of the phrase tokens, so when
> there is no second token, nothing can indicate that a gap follows.)
> The above query then fails to match "go to webpage", because "go to
> webpage" is tokenized in the index as "go _ webpage", while the query,
> having lost its gap, only tries to match "go webpage".
> To work around that, I represent "go to" not as a phrase but as a
> SpanNearQuery with a gap, like this:
> {code}
> spanNear(
>   [
>     spanOr(
>       [
>         Title:open,
>         spanNear([Title:go, SpanGap(:1)], 0, true)
>       ]
>     ),
>     Title:webpag
>   ],
>   0,
>   true
> )
> {code}
> However, when I run that query, I get the following:
> {code}
> A Java exception occurred: java.lang.UnsupportedOperationException
> at org.apache.lucene.search.spans.SpanNearQuery$GapSpans.positionsCost(SpanNearQuery.java:398)
> at org.apache.lucene.search.spans.ConjunctionSpans.asTwoPhaseIterator(ConjunctionSpans.java:96)
> at org.apache.lucene.search.spans.NearSpansOrdered.asTwoPhaseIterator(NearSpansOrdered.java:45)
> at org.apache.lucene.search.spans.ScoringWrapperSpans.asTwoPhaseIterator(ScoringWrapperSpans.java:88)
> at org.apache.lucene.search.ConjunctionDISI.addSpans(ConjunctionDISI.java:104)
> at org.apache.lucene.search.ConjunctionDISI.intersectSpans(ConjunctionDISI.java:82)
> at org.apache.lucene.search.spans.ConjunctionSpans.<init>(ConjunctionSpans.java:41)
> at org.apache.lucene.search.spans.NearSpansOrdered.<init>(NearSpansOrdered.java:54)
> at org.apache.lucene.search.spans.SpanNearQuery$SpanNearWeight.getSpans(SpanNearQuery.java:232)
> at org.apache.lucene.search.spans.SpanWeight.scorer(SpanWeight.java:134)
> at org.apache.lucene.search.spans.SpanWeight.scorer(SpanWeight.java:38)
> at org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
> {code}
> ... and when I look up that GapSpans class in SpanNearQuery.java, I see:
> {code}
> @Override
> public float positionsCost() {
>   throw new UnsupportedOperationException();
> }
> {code}
> I asked about this on the mailing list on May 14 and was directed to
> file a bug here.
> This issue is of relatively high priority for us, because this is the
> most promising technique we have for supporting synonyms on top of
> Lucene (the SynonymFilter has serious issues with multi-word synonyms).
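
The position arithmetic behind the failed match can be illustrated without Lucene. What follows is a minimal plain-Java sketch, not Lucene code and not the fix in the attached patch. The names `GapSketch`, `analyze`, and `matches` are hypothetical. The toy analyzer only lowercases, splits on whitespace, and drops stop words (stemming is left out, so "webpage" stays unstemmed), and the matcher imitates an ordered, slop-0 near match by checking fixed position offsets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: a toy positional index, not Lucene's implementation.
final class GapSketch {

    // Maps each kept term to the positions where it occurs. A removed stop word
    // still consumes a position, which leaves a hole ("go _ webpage").
    static Map<String, List<Integer>> analyze(String text, Set<String> stopWords) {
        Map<String, List<Integer>> positions = new HashMap<>();
        int pos = 0;
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!stopWords.contains(token)) {
                positions.computeIfAbsent(token, k -> new ArrayList<>()).add(pos);
            }
            pos++; // advance even when the token was dropped
        }
        return positions;
    }

    // Ordered match with no slop: terms[i] must occur exactly at start + offsets[i],
    // where start is a position of terms[0] and offsets[0] is 0.
    static boolean matches(Map<String, List<Integer>> doc, String[] terms, int[] offsets) {
        List<Integer> none = Collections.<Integer>emptyList();
        for (int start : doc.getOrDefault(terms[0], none)) {
            boolean ok = true;
            for (int i = 1; i < terms.length && ok; i++) {
                ok = doc.getOrDefault(terms[i], none).contains(start + offsets[i]);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("to"));
        Map<String, List<Integer>> doc = analyze("go to webpage", stop);
        String[] terms = {"go", "webpage"};

        // Query that lost its gap: expects "webpage" right after "go".
        System.out.println(matches(doc, terms, new int[] {0, 1})); // prints false

        // Query that keeps a gap of width 1 after "go".
        System.out.println(matches(doc, terms, new int[] {0, 2})); // prints true
    }
}
```

In this sketch the second offset array plays the role that the gap in the nested SpanNearQuery is meant to play: it records that one position after "go" is unoccupied, so the following term is expected two positions later rather than one.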