You can preprocess the query to remove anything that was not indexed (words shorter than 3 characters), but that initial schema decision was a mistake. It should be remedied and the data reindexed.
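As a rough sketch of that preprocessing step, assuming the user input is plain text (no Lucene query syntax) and that a simple word split is close enough to what StandardTokenizerFactory does, something like the following could run in the client before the query is sent. The function name `drop_unindexed_terms` is hypothetical, and the threshold of 3 is taken from the minGramSize="3" in the quoted config.

```python
import re

# Assumption: mirrors minGramSize="3" in the index analyzer quoted below.
MIN_INDEXED_LENGTH = 3


def drop_unindexed_terms(query: str, min_len: int = MIN_INDEXED_LENGTH) -> str:
    """Return the query with every word shorter than min_len removed.

    Words are extracted with a simple regex, which only approximates
    Solr's StandardTokenizerFactory. Any punctuation or query operators
    in the input are discarded.
    """
    words = re.findall(r"\w+", query)
    kept = [word for word in words if len(word) >= min_len]
    return " ".join(kept)


if __name__ == "__main__":
    print(drop_unindexed_terms("Al Abbas"))
```

With this in place a search for "Al Abbas" is sent as "Abbas", which the existing index can match. A query made up only of short words comes back as an empty string, so the caller has to decide what to do in that case.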
> On Oct 25, 2021, at 8:36 AM, son hoang <sonhoan...@gmail.com> wrote:
>
> Is there any way in the query so that I do not need to reindex the whole
> data?
>
>> On 2021/10/23 15:39:18, Walter Underwood <wun...@wunderwood.org> wrote:
>> Agreed. There is a simple fix. Index all the words. Also, stop using
>> EdgeNgramFilter.
>> That is only used for completion, not word search.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/ (my blog)
>>
>>>> On Oct 23, 2021, at 4:31 AM, Dave <hastings.recurs...@gmail.com> wrote:
>>>
>>> Why ever would you not index less than three characters?
>>> “To be or not to be”
>>> Seems like a significant search
>>>
>>>> On Oct 23, 2021, at 7:28 AM, son hoang <sonhoan...@gmail.com> wrote:
>>>>
>>>> Yep, words less than 3 chars will not be indexed. But if "Al Abbas" text
>>>> can be separated into a token "Abbas" (and "Al" but it is not counted as
>>>> a token as it has 2 chars only) then we can apply OR condition in the
>>>> query?
>>>>
>>>>> On 2021/10/22 14:37:51, Andy C <andycs...@gmail.com> wrote:
>>>>> The issue looks to me to be with the use of EdgeNGramFilterFactory in your
>>>>> field type. You have configured it with minGramSize="3" and have not
>>>>> specified preserveOriginal="true".
>>>>>
>>>>> So words less than 3 characters will not be indexed, and therefore can't
>>>>> be searched.
>>>>>
>>>>> See
>>>>> https://solr.apache.org/guide/8_8/filter-descriptions.html#edge-n-gram-filter
>>>>>
>>>>> - Andy -
>>>>>
>>>>>> On Fri, Oct 22, 2021 at 10:12 AM son hoang <sonhoan...@gmail.com> wrote:
>>>>>>
>>>>>> Thanks, Thamiz
>>>>>>
>>>>>> It seems that I have index=StandardTokenizerFactory causing the issue
>>>>>>
>>>>>> I do not want to re-index. Is there any solution ? Should I have query
>>>>>> "OR" so that the search can return "Al Abbas" when I have "Al Abbas" in
>>>>>> the query field (eg: there is a OR match "Abbas" ?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On 2021/10/21 07:56:20, Thamizhazhagan B <thamizhazhagan....@kp.org>
>>>>>> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Create a copy field as below and use this copyfield in your query..
>>>>>>>
>>>>>>> <copyField source="_name" dest="itemFullName"/>
>>>>>>> <field name="itemFullName" type="itemFullName_type" stored="true"
>>>>>>>        indexed="true" termVectors="true" termPositions="true"
>>>>>>>        termOffsets="true"/>
>>>>>>>
>>>>>>> <fieldType name="itemFullName_type" class="solr.TextField"
>>>>>>>            sortMissingLast="true" omitNorms="true" positionIncrementGap="100"
>>>>>>>            multiValued="false">
>>>>>>>   <analyzer type="index">
>>>>>>>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>>>>>>>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
>>>>>>>             ignoreCase="true"/>
>>>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>>>   </analyzer>
>>>>>>>   <analyzer type="query">
>>>>>>>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>>>>>>>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
>>>>>>>             ignoreCase="true"/>
>>>>>>>     <filter class="solr.SynonymFilterFactory" expand="true"
>>>>>>>             ignoreCase="true" synonyms="synonyms.txt"/>
>>>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>>>   </analyzer>
>>>>>>> </fieldType>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Thamizh
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: son hoang <sonhoan...@gmail.com>
>>>>>>> Sent: Thursday, October 21, 2021 8:19 AM
>>>>>>> To: users@solr.apache.org
>>>>>>> Subject: Index for text with space
>>>>>>>
>>>>>>> Hello
>>>>>>>
>>>>>>> I have a config like this:
>>>>>>>
>>>>>>> <fieldtype name="tok" class="solr.TextField" positionIncrementGap="100">
>>>>>>>   <analyzer type="index">
>>>>>>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>>>>>>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>>>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>>>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
>>>>>>>             maxGramSize="15"/>
>>>>>>>   </analyzer>
>>>>>>>   <analyzer type="query">
>>>>>>>     <tokenizer class="solr.StandardTokenizerFactory" />
>>>>>>>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>>>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>>>     <!-- <filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
>>>>>>>             maxGramSize="15"/> -->
>>>>>>>   </analyzer>
>>>>>>> </fieldtype>
>>>>>>>
>>>>>>> Using this config:
>>>>>>>
>>>>>>> 1. When I search for "Abbas", the result for "Al Abbas" appears.
>>>>>>>
>>>>>>> 2. When I search for "Al Abbas" in the search field, I get no results.
>>>>>>>
>>>>>>> It seems that "Al Abbas" is not indexed. What I should do in the config
>>>>>>> so #2 can return the result
>>>>>>>
>>>>>>> Many thanks