Hi, at my workplace we have been facing a curious dilemma. We have a very problematic word that we want to remove from queries. We added it to our stopwords list, but when the word is queried on its own it is not removed, since otherwise the query would be empty. This is understandable behavior: if a query contains only stopwords, then they carry meaning for that query and should be searched against. However, we really need to get rid of this particular word.
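A minimal sketch of how such a single-word query could be reproduced and inspected, assuming the eDisMax query parser and a local Solr instance. The host, the collection name `products`, and the field name `title` are hypothetical placeholders and not taken from our setup; `defType`, `q`, `qf`, `debugQuery`, and `stopwords` are standard Solr request parameters.

```python
from urllib.parse import urlencode

# Hypothetical values: adjust to the actual Solr host, collection, and field.
SOLR_SELECT = "http://localhost:8983/solr/products/select"
QUERY_FIELD = "title"


def build_debug_url(word, stopwords=None):
    """Build a select URL that queries a single word with eDisMax and debug output."""
    params = {
        "defType": "edismax",
        "q": word,
        "qf": QUERY_FIELD,
        "debugQuery": "true",  # the response then includes the parsed query
    }
    if stopwords is not None:
        # eDisMax's `stopwords` request parameter, sent as "true" or "false"
        params["stopwords"] = "true" if stopwords else "false"
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    # Compare a word from the stopword list with the word that must always go.
    for word in ("of", "the"):
        print(build_debug_url(word))
```

Opening the printed URLs and comparing the parsed query in the debug section of each response should show whether the token survived analysis.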
We found that the stop filter does indeed remove single tokens, but the token survives when queried with eDisMax. Looking at the documentation of this query parser, we found that it kind of overrides the behavior of the stop filter, giving us the behavior detailed above. We tried the stopwords flag to specify that we don't want that override, but it doesn't work.

So we tried to write our own custom stopwords filter, and while doing so we found that using two consecutive stop filters does indeed remove the word. We can even feed different word lists to the two filters, so that only the problematic word is always deleted while the other stopwords are left alone when queried on their own.

*Why does this work like this?*

Here is our query pipeline:

    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="dropwords.txt"/>
      <filter class="solr.ASCIIFoldingFilterFactory"/>
      <filter class="solr.EnglishMinimalStemFilterFactory"/>
    </analyzer>

stopwords.txt has the common English stopwords and dropwords.txt has just "the" (for example). So a query consisting only of "of" or "a" keeps the token, but a query consisting only of "the" does not.

We are using Solr 7.7, by the way.

Thank you so much, I would like to know your input on this.

-- 
*Ricardo Soto Estévez* <ricar...@empathy.co>
Backend Engineer