I don’t use that graph filter, but from the documentation it looks like a couple of its other split options may still be affecting those tokens (for example splitOnCaseChange, splitOnNumerics, and generateNumberParts, which default to enabled).
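An untested sketch of what that might look like: the parameter names below are real WordDelimiterGraphFilterFactory options from the Solr reference guide, but the particular combination is an assumption about what you want (keep tokens like 640-01 whole rather than splitting them at the dash):

```xml
<!-- Hypothetical variant of the query-time filter: disable the split/generate
     options so alphanumeric article numbers such as 640-01 stay intact. -->
<filter name="wordDelimiterGraph"
        types="wordDelimiters.txt"
        splitOnCaseChange="0"
        splitOnNumerics="0"
        generateWordParts="0"
        generateNumberParts="0"
        preserveOriginal="1"/>
```

You can verify what each combination actually produces for an input like 640-01 in the Analysis screen of the Solr admin UI before committing to it.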
Some of the apparent complexity here comes from using text-oriented fields and tokenizers to capture what appear to be structured article identifiers. If you specifically need to find these identifiers inside free-text content, you might be better served by a different tokenizer (maybe even the ClassicTokenizer) or a regex matcher. If you don’t actually need to find those numbers in text, you might be better served by a plain string field.

On Wed, Apr 17, 2024 at 8:43 AM Carsten Klement <kont...@carsten-klement.de> wrote:

> Hello, doesn’t anyone have an idea? ☹
>
> On 10.04.24, 11:40, "Carsten Klement" <kont...@carsten-klement.de> wrote:
>
> Hello,
> I think I have a problem with the configured word delimiters.
>
> For example, I would like 3 items to be found when searching for 640 or
> 640-0, and two when searching for 640-01.
>
> #1
> artikelnummer_txt:"640*" AND lng:"de"
> "docs":[{
>   "artikelnummer_txt":"640-01"
> },{
>   "artikelnummer_txt":"640-02"
> },{
>   "artikelnummer_txt":"640-01LFM"
> }]
>
> This is perfect: everything from the "artikelnummer_txt" field that starts
> with 640 is found.
>
> #2
> artikelnummer_txt:"640-0*" AND lng:"de"
> "docs":[ ]
>
> However, if I enter a "-" followed by a "0", no article is found. Here I
> expect all three items.
>
> #3
> artikelnummer_txt:"640-01*" AND lng:"de"
> "docs":[{
>   "artikelnummer_txt":"640-01"
> }]
>
> Here I only get one item, but I expect two items.
>
> My configuration in schema.xml:
>
> <dynamicField name="*_txt" type="text_general" indexed="true"
>               stored="true"/>
> <fieldType name="text_general" class="solr.TextField"
>            positionIncrementGap="100" multiValued="false">
>   <analyzer type="index">
>     <tokenizer name="standard"/>
>     <filter ignoreCase="true" words="stopwords.txt" name="stop"/>
>     <filter name="lowercase"/>
>   </analyzer>
>
>   <analyzer type="query">
>     <tokenizer name="standard"/>
>     <!-- Test START -->
>     <filter name="wordDelimiterGraph" types="wordDelimiters.txt"/>
>     <filter name="flattenGraph"/>
>     <!-- Test ENDE -->
>     <filter ignoreCase="true" words="stopwords.txt" name="stop"/>
>     <filter ignoreCase="true" synonyms="synonyms.txt" name="synonymGraph"
>             expand="true"/>
>     <filter name="lowercase"/>
>   </analyzer>
> </fieldType>
>
> ### wordDelimiters.txt
> # Don't split numbers at '$', '.' or ','
> $ => DIGIT
> . => DIGIT
> - => ALPHANUM
>
> Maybe someone has an idea what I'm doing wrong?
>
> Thanks
> Carsten
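For completeness, here is a minimal sketch of the plain-string option I mentioned above. The field and type names are illustrative, not taken from your schema; solr.StrField indexes the whole value as a single token, with no analysis chain to split it at the dash:

```xml
<!-- Illustrative: index article numbers verbatim as one token -->
<fieldType name="string_exact" class="solr.StrField" sortMissingLast="true"/>
<dynamicField name="*_str" type="string_exact" indexed="true" stored="true"/>
```

With that, a prefix query such as artikelnummer_str:640\-0* (escaping the dash is a safe precaution with the default query parser) should match 640-01, 640-02 and 640-01LFM, since the indexed token is the literal article number. You would lose lowercasing, stopwords and synonyms on that field, which is usually fine for identifiers.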