Is there any way to handle this in the query, so that I do not need to reindex all the data?

On 2021/10/23 15:39:18, Walter Underwood <wun...@wunderwood.org> wrote:
> Agreed. There is a simple fix. Index all the words. Also, stop using
> EdgeNgramFilter.
> That is only used for completion, not word search.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
> > On Oct 23, 2021, at 4:31 AM, Dave <hastings.recurs...@gmail.com> wrote:
> >
> > Why ever would you not index less than three characters?
> > “To be or not to be”
> > Seems like a significant search
> >
> >> On Oct 23, 2021, at 7:28 AM, son hoang <sonhoan...@gmail.com> wrote:
> >>
> >> Yep, words less than 3 chars will not be indexed. But if "Al Abbas" text
> >> can be separated into a token "Abbas" (and "Al" but it is not counted as
> >> a token as it has 2 chars only) then we can apply OR condition in the
> >> query?
> >>
> >>> On 2021/10/22 14:37:51, Andy C <andycs...@gmail.com> wrote:
> >>> The issue looks to me to be with the use of EdgeNGramFilterFactory in your
> >>> field type. You have configured it with minGramSize="3" and have not
> >>> specified preserveOriginal="true".
> >>>
> >>> So words less than 3 characters will not be indexed, and therefore can't
> >>> be searched.
> >>>
> >>> See
> >>> https://solr.apache.org/guide/8_8/filter-descriptions.html#edge-n-gram-filter
> >>>
> >>> - Andy -
> >>>
> >>>> On Fri, Oct 22, 2021 at 10:12 AM son hoang <sonhoan...@gmail.com> wrote:
> >>>>
> >>>> Thanks, Thamiz
> >>>>
> >>>> It seems that I have index=StandardTokenizerFactory causing the issue
> >>>>
> >>>> I do not want to re-index. Is there any solution ? Should I have query
> >>>> "OR" so that the search can return "Al Abbas" when I have "Al Abbas" in
> >>>> the query field (eg: there is a OR match "Abbas" ?
> >>>>
> >>>> Thanks
> >>>>
> >>>> On 2021/10/21 07:56:20, Thamizhazhagan B <thamizhazhagan....@kp.org>
> >>>> wrote:
> >>>>> Hi,
> >>>>>
> >>>>> Create a copy field as below and use this copyfield in your query..
> >>>>>
> >>>>> <copyField source="_name" dest="itemFullName"/>
> >>>>> <field name="itemFullName" type="itemFullName_type" stored="true"
> >>>>>        indexed="true" termVectors="true" termPositions="true"
> >>>>>        termOffsets="true"/>
> >>>>>
> >>>>> <fieldType name="itemFullName_type" class="solr.TextField"
> >>>>>            sortMissingLast="true" omitNorms="true" positionIncrementGap="100"
> >>>>>            multiValued="false">
> >>>>>   <analyzer type="index">
> >>>>>     <tokenizer class="solr.KeywordTokenizerFactory"/>
> >>>>>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
> >>>>>             ignoreCase="true"/>
> >>>>>     <filter class="solr.LowerCaseFilterFactory"/>
> >>>>>   </analyzer>
> >>>>>   <analyzer type="query">
> >>>>>     <tokenizer class="solr.KeywordTokenizerFactory"/>
> >>>>>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
> >>>>>             ignoreCase="true"/>
> >>>>>     <filter class="solr.SynonymFilterFactory" expand="true"
> >>>>>             ignoreCase="true" synonyms="synonyms.txt"/>
> >>>>>     <filter class="solr.LowerCaseFilterFactory"/>
> >>>>>   </analyzer>
> >>>>> </fieldType>
> >>>>>
> >>>>> Thanks,
> >>>>> Thamizh
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: son hoang <sonhoan...@gmail.com>
> >>>>> Sent: Thursday, October 21, 2021 8:19 AM
> >>>>> To: users@solr.apache.org
> >>>>> Subject: Index for text with space
> >>>>>
> >>>>> Hello
> >>>>>
> >>>>> I have a config like this:
> >>>>>
> >>>>> <fieldtype name="tok" class="solr.TextField" positionIncrementGap="100">
> >>>>>   <analyzer type="index">
> >>>>>     <tokenizer class="solr.StandardTokenizerFactory"/>
> >>>>>     <filter class="solr.ASCIIFoldingFilterFactory"/>
> >>>>>     <filter class="solr.LowerCaseFilterFactory"/>
> >>>>>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
> >>>>>             maxGramSize="15"/>
> >>>>>   </analyzer>
> >>>>>   <analyzer type="query">
> >>>>>     <tokenizer class="solr.StandardTokenizerFactory" />
> >>>>>     <filter class="solr.ASCIIFoldingFilterFactory"/>
> >>>>>     <filter class="solr.LowerCaseFilterFactory"/>
> >>>>>     <!-- <filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
> >>>>>          maxGramSize="15"/> -->
> >>>>>   </analyzer>
> >>>>> </fieldtype>
> >>>>>
> >>>>> Using this config:
> >>>>>
> >>>>> 1. When I search for "Abbas", the result for "Al Abbas" appears.
> >>>>>
> >>>>> 2. When I search for "Al Abbas" in the search field, I get no results.
> >>>>>
> >>>>> It seems that "Al Abbas" is not indexed. What I should do in the config
> >>>>> so #2 can return the result
> >>>>>
> >>>>> Many thanks
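
For reference, the change Andy describes could be sketched roughly as below. This is only an illustration built from the "tok" field type in the original message, not a tested configuration. The single difference is preserveOriginal="true" on the index-time EdgeNGramFilterFactory, which according to the linked Solr 8.8 guide keeps terms shorter than minGramSize, so "al" would be indexed alongside the grams of "abbas". Because it changes index-time analysis, it would only affect documents indexed after the change, so existing data would still need to be reindexed.

```xml
<!-- Sketch only: the "tok" field type from the original message with
     preserveOriginal="true" added, as suggested in the thread.
     Assumes a Solr version whose EdgeNGramFilterFactory supports
     preserveOriginal (it is listed in the 8.8 guide linked above). -->
<fieldtype name="tok" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Keep the original term even when it is shorter than minGramSize -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
            maxGramSize="15" preserveOriginal="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldtype>
```

If reindexing is not an option, the query-side "OR" idea discussed above might be tried by sending q.op=OR with the request, so that a match on "abbas" alone is enough. Whether that helps depends on the query parser and any mm setting in use, which the thread does not show.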