Re: Searching and WordDelimiterGraphFilterFactory

Shaun Campbell Tue, 09 Mar 2021 13:21:17 -0800

Hi Susmit

That didn't seem to work, though I don't know whether I was doing something
wrong. I ended up writing a regex that splits the incoming string into runs
of numbers and letters, and I build up the query manually from those parts.
It's all working now.
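
For illustration only, here is a minimal Python sketch of that kind of split-and-join. It is not the code actually used in this thread. The field name `award_id`, the function names, and the lowercasing step are assumptions made for the example.

```python
import re

# Hypothetical field name; the thread only gives the field *type* name.
FIELD = "award_id"

# One run of ASCII letters or one run of digits; anything else
# (hyphens, slashes, spaces) acts as a separator.
PART_RE = re.compile(r"[A-Za-z]+|[0-9]+")


def split_id(text):
    """Split user input into lowercased runs of letters and runs of digits."""
    return [part.lower() for part in PART_RE.findall(text)]


def build_query(text, field=FIELD):
    """Join every part with AND so that all parts are required to match."""
    return " AND ".join(f"{field}:{part}" for part in split_id(text))


if __name__ == "__main__":
    print(build_query("DRF-2018-11"))
```

Joining the parts with an explicit AND requires every component to match, which is the behaviour asked for in the original question. The parts contain only letters and digits, so no query-syntax escaping is needed in this sketch.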


Thanks
Shaun

On Tue, 9 Mar 2021 at 16:50, Susmit <shukla.sus...@gmail.com> wrote:

> q.op=AND could be useful: the parts broken down by
> WordDelimiterGraphFilterFactory would then be joined by AND.
>
> > On Mar 9, 2021, at 3:07 AM, Shaun Campbell <campbell.sh...@gmail.com> wrote:
> >
> > Hi
> >
> > I'm trying to produce an autosuggestion field for project ids using
> > ngrams and WordDelimiterGraphFilterFactory to split on word/number
> > boundaries.
> >
> > The ids come in various formats, for example nihr123456, 12/34/567 and
> > DRF-2018-11-ST2-062.
> >
> > What I'm trying to do is allow the user to enter the number parts, the
> > alphabetical characters, or both, and match all of them. The basic
> > autosuggestion is working, but I have an issue where the query is matching
> > some but not all of the component parts. For example:
> >
> > I enter DRF-2018-11 and it matches:
> >
> > DRF-2018-11-ST2-062
> > PB-PG-0909-20188
> > CS-2018-18-ST2-005
> >
> >
> > The first one is correct because it matches the DRF, the 2018 and the 11.
> > I don't want the second and third ones because there's no DRF or 11 in
> > those ids. Is there any way to get around this problem in Solr
> > configuration, or do I have to split the id manually in code and construct
> > a query where the id is DRF AND id is 2018 AND id is 11?
> >
> > Here is my field type configuration:
> >
> > <fieldType name="ngram_award_id" class="solr.TextField"
> >            positionIncrementGap="100" autoGeneratePhraseQueries="true">
> >   <analyzer type="index">
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.WordDelimiterGraphFilterFactory"
> >             generateWordParts="1" generateNumberParts="1"
> >             catenateWords="0" catenateNumbers="0" catenateAll="0"
> >             splitOnCaseChange="0" splitOnNumerics="1"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.FlattenGraphFilterFactory"/>
> >     <filter class="solr.EdgeNGramFilterFactory"
> >             minGramSize="3" maxGramSize="7"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.WordDelimiterGraphFilterFactory"
> >             generateWordParts="1" generateNumberParts="1"
> >             catenateWords="0" catenateNumbers="0" catenateAll="0"
> >             splitOnCaseChange="0" splitOnNumerics="1"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> > Thanks
> > Shaun
>
