Agreed. There is a simple fix: index all the words. Also, stop using EdgeNGramFilter for this field. It is meant for completion, not word search.
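
A minimal sketch of what that could look like, assuming the "tok" field type from the original post quoted below. The name "tok_words" is made up for illustration, and existing documents would have to be re-indexed before the change takes effect.

```xml
<!-- Sketch only: "tok_words" is a hypothetical name; the filters mirror the original "tok" type. -->
<fieldType name="tok_words" class="solr.TextField" positionIncrementGap="100">
  <!-- A single analyzer is applied at both index and query time,
       so every word is indexed whole, including short ones such as "al". -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

If prefix matching is still wanted on the same field, the alternative Andy describes below is to keep EdgeNGramFilterFactory in the index analyzer and add preserveOriginal="true", so that terms shorter than minGramSize are kept.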

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Oct 23, 2021, at 4:31 AM, Dave <hastings.recurs...@gmail.com> wrote:
>
> Why ever would you not index less than three characters?
> “To be or not to be”
> Seems like a significant search
>
>> On Oct 23, 2021, at 7:28 AM, son hoang <sonhoan...@gmail.com> wrote:
>>
>> Yep, words less than 3 chars will not be indexed. But if "Al Abbas" text
>> can be separated into a token "Abbas" (and "Al" but it is not counted as a
>> token as it has 2 chars only) then we can apply OR condition in the query?
>>
>>> On 2021/10/22 14:37:51, Andy C <andycs...@gmail.com> wrote:
>>> The issue looks to me to be with the use of EdgeNGramFilterFactory in your
>>> field type. You have configured it with minGramSize="3" and have not
>>> specified preserveOriginal="true".
>>>
>>> So words less than 3 characters will not be indexed, and therefore can't be
>>> searched.
>>>
>>> See
>>> https://solr.apache.org/guide/8_8/filter-descriptions.html#edge-n-gram-filter
>>>
>>> - Andy -
>>>
>>>> On Fri, Oct 22, 2021 at 10:12 AM son hoang <sonhoan...@gmail.com> wrote:
>>>>
>>>> Thanks, Thamiz
>>>>
>>>> It seems that I have index=StandardTokenizerFactory causing the issue
>>>>
>>>> I do not want to re-index. Is there any solution ? Should I have query
>>>> "OR" so that the search can return "Al Abbas" when I have "Al Abbas" in
>>>> the query field (eg: there is a OR match "Abbas" ?
>>>>
>>>> Thanks
>>>>
>>>> On 2021/10/21 07:56:20, Thamizhazhagan B <thamizhazhagan....@kp.org>
>>>> wrote:
>>>>> Hi,
>>>>>
>>>>> Create a copy field as below and use this copyfield in your query..
>>>>>
>>>>> <copyField source="_name" dest="itemFullName"/>
>>>>> <field name="itemFullName" type="itemFullName_type" stored="true"
>>>>>        indexed="true" termVectors="true" termPositions="true" termOffsets="true"/>
>>>>>
>>>>> <fieldType name="itemFullName_type" class="solr.TextField"
>>>>>            sortMissingLast="true" omitNorms="true" positionIncrementGap="100"
>>>>>            multiValued="false">
>>>>>   <analyzer type="index">
>>>>>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>>>>>     <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>   </analyzer>
>>>>>   <analyzer type="query">
>>>>>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>>>>>     <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
>>>>>     <filter class="solr.SynonymFilterFactory" expand="true"
>>>>>             ignoreCase="true" synonyms="synonyms.txt"/>
>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>   </analyzer>
>>>>> </fieldType>
>>>>>
>>>>> Thanks,
>>>>> Thamizh
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: son hoang <sonhoan...@gmail.com>
>>>>> Sent: Thursday, October 21, 2021 8:19 AM
>>>>> To: users@solr.apache.org
>>>>> Subject: Index for text with space
>>>>>
>>>>> Hello
>>>>>
>>>>> I have a config like this:
>>>>>
>>>>> <fieldtype name="tok" class="solr.TextField" positionIncrementGap="100">
>>>>>   <analyzer type="index">
>>>>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>>>>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15"/>
>>>>>   </analyzer>
>>>>>   <analyzer type="query">
>>>>>     <tokenizer class="solr.StandardTokenizerFactory" />
>>>>>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>     <!-- <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15"/> -->
>>>>>   </analyzer>
>>>>> </fieldtype>
>>>>>
>>>>> Using this config:
>>>>>
>>>>> 1. When I search for "Abbas", the result for "Al Abbas" appears.
>>>>>
>>>>> 2. When I search for "Al Abbas" in the search field, I get no results.
>>>>>
>>>>> It seems that "Al Abbas" is not indexed. What I should do in the config
>>>>> so #2 can return the result
>>>>>
>>>>> Many thanks