Agreed. There is a simple fix: index all the words. Also, stop using
EdgeNGramFilter. It is meant for completion (prefix matching), not word
search.
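
A minimal sketch of what that could look like, assuming the "tok" field type
from the original post below. The name "tok_words" is hypothetical, and the
chain is simply the posted one with EdgeNGramFilterFactory removed:

```xml
<!-- Sketch only: "tok_words" is a made-up name, not from the original post. -->
<!-- A single <analyzer> without a type applies at both index and query time. -->
<fieldType name="tok_words" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With this chain "Al Abbas" should be indexed as the two terms "al" and
"abbas", so both words can match. Existing documents would have to be
re-indexed for the change to take effect.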

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Oct 23, 2021, at 4:31 AM, Dave <hastings.recurs...@gmail.com> wrote:
> 
> Why ever would you not index words shorter than three characters?
> “To be or not to be”
> seems like a significant search.
> 
>> On Oct 23, 2021, at 7:28 AM, son hoang <sonhoan...@gmail.com> wrote:
>> 
>> Yep, words shorter than 3 characters will not be indexed. But if the text
>> "Al Abbas" is split into the token "Abbas" (plus "Al", which is dropped
>> because it has only 2 characters), can we then apply an OR condition in
>> the query?
>> 
>>> On 2021/10/22 14:37:51, Andy C <andycs...@gmail.com> wrote: 
>>> The issue looks to me to be with the use of EdgeNGramFilterFactory in your
>>> field type. You have configured it with minGramSize="3" and have not
>>> specified preserveOriginal="true".
>>> 
>>> So words less than 3 characters will not be indexed, and therefore can't be
>>> searched.
>>> 
>>> See
>>> https://solr.apache.org/guide/8_8/filter-descriptions.html#edge-n-gram-filter
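>>> 
>>> As a rough sketch of that change (not tested against your schema): the
>>> filter line below is taken from the "tok" index analyzer in the original
>>> post, with only preserveOriginal="true" added. It assumes a Solr version
>>> whose EdgeNGramFilterFactory supports that attribute, as the 8.8 guide
>>> linked above describes.
>>> 
>>> ```xml
>>> <!-- Index analyzer only; the rest of the "tok" chain stays as posted. -->
>>> <filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
>>>         maxGramSize="15" preserveOriginal="true"/>
>>> ```
>>> 
>>> With this, "al" should be kept as its own term next to the grams of
>>> "abbas" ("abb", "abba", "abbas"). Documents already in the index would
>>> need to be re-indexed before the change takes effect.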
>>> 
>>> - Andy -
>>> 
>>>> On Fri, Oct 22, 2021 at 10:12 AM son hoang <sonhoan...@gmail.com> wrote:
>>>> 
>>>> Thanks, Thamiz
>>>> 
>>>> It seems that using StandardTokenizerFactory in my index analyzer is
>>>> causing the issue.
>>>> 
>>>> I do not want to re-index. Is there any solution? Should I use an "OR"
>>>> query so that the search returns "Al Abbas" when I have "Al Abbas" in
>>>> the query field (e.g., through an OR match on "Abbas")?
>>>> 
>>>> Thanks
>>>> 
>>>> On 2021/10/21 07:56:20, Thamizhazhagan B <thamizhazhagan....@kp.org>
>>>> wrote:
>>>>> Hi,
>>>>> 
>>>>> Create a copy field as below and use it in your query.
>>>>> 
>>>>> <copyField source="_name" dest="itemFullName"/>
>>>>> <field name="itemFullName" type="itemFullName_type" stored="true"
>>>>>        indexed="true" termVectors="true" termPositions="true"
>>>>>        termOffsets="true"/>
>>>>> 
>>>>> <fieldType name="itemFullName_type" class="solr.TextField"
>>>>>            sortMissingLast="true" omitNorms="true"
>>>>>            positionIncrementGap="100" multiValued="false">
>>>>>   <analyzer type="index">
>>>>>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>>>>>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
>>>>>             ignoreCase="true"/>
>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>   </analyzer>
>>>>>   <analyzer type="query">
>>>>>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>>>>>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
>>>>>             ignoreCase="true"/>
>>>>>     <filter class="solr.SynonymFilterFactory" expand="true"
>>>>>             ignoreCase="true" synonyms="synonyms.txt"/>
>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>   </analyzer>
>>>>> </fieldType>
>>>>> 
>>>>> Thanks,
>>>>> Thamizh
>>>>> 
>>>>> 
>>>>> -----Original Message-----
>>>>> From: son hoang <sonhoan...@gmail.com>
>>>>> Sent: Thursday, October 21, 2021 8:19 AM
>>>>> To: users@solr.apache.org
>>>>> Subject: Index for text with space
>>>>> 
>>>>> Hello
>>>>> 
>>>>> I have a config like this:
>>>>> 
>>>>> <fieldtype name="tok" class="solr.TextField" positionIncrementGap="100">
>>>>>   <analyzer type="index">
>>>>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>>>>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15"/>
>>>>>   </analyzer>
>>>>>   <analyzer type="query">
>>>>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>>>>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>     <!-- <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15"/> -->
>>>>>   </analyzer>
>>>>> </fieldtype>
>>>>> 
>>>>> Using this config:
>>>>> 
>>>>> 1. When I search for "Abbas", the result for "Al Abbas" appears.
>>>>> 
>>>>> 2. When I search for "Al Abbas" in the search field, I get no results.
>>>>> 
>>>>> It seems that "Al Abbas" is not indexed. What should I do in the config
>>>>> so that #2 returns the result?
>>>>> 
>>>>> Many thanks
>>>>> 
>>>> 
>>> 
