Mike Lissner created SOLR-10102:
-----------------------------------

             Summary: SynonymFilterFactory in example file is on query not index
                 Key: SOLR-10102
                 URL: https://issues.apache.org/jira/browse/SOLR-10102
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: examples
    Affects Versions: 6.4.1, 4.10.2
            Reporter: Mike Lissner


The example files for both 4.10.2 and 6.4.1 have entries like these:

{code:xml}
  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <!-- THIS IS WRONG, RIGHT? -->
      <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
{code}

You'll note that the synonym filter is applied only in the query analyzer, not in the index analyzer, which, as far as I can tell, makes matching fail. Even [the docs|https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory] say:

bq. The recommended approach for dealing with synonyms like this, is to expand 
the synonym when indexing.

Can we fix this? Or is there a reason the example is set up this way? As I understand 
it, having synonyms only on the query side means that documents that should 
match just won't be returned.
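
To make the request concrete, here is a rough sketch of the index-time variant I have in mind. It is untested and assumes the same synonyms.txt and the same filter attributes as the shipped example; the only change is that the synonym filter moves from the query analyzer to the index analyzer. Existing documents would need to be reindexed for a change like this to take effect.

{code:xml}
  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <!-- Assumption: synonyms are expanded here, at index time -->
      <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <!-- No synonym filter on the query side in this sketch -->
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
{code}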

For example, we have the token "5" set up as a synonym for the word "five". 
So, if somebody searches for 5, the query analyzer expands it to "5 AND 
five", which, sure enough, the index doesn't match, so there are no results. 
Instead of expanding the result set, like synonyms are supposed to do, this 
actively contracts it.

I hope my frustration in this is misplaced, but if I'm right about this bug, 
can I say that this is the kind of thing that makes Solr super frustrating to 
use? 


