In my Solr 9.2 schema, I am using a dynamicField
which tokenizes with solr.StandardTokenizerFactory for both index and query.
However, when I query with, for example,
metadata_txt:XYZ.tif
I see many more hits than I expect. When I add debug=true to the query, I
see:
metadata_txt:XYZ.tif
met
> > ~~Bill
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> https://t.me/MUST_SEARCH
> A caveat: Cyrillic!
>
--
Human wheels spin round and round
While the clock keeps the pace... -- John Mellencamp
Bill Tantzen
University of Minnesota Libraries
612-626-9949 (U of M)
612-325-1777 (cell)
on dot is
> what I expect from StandardTokenizer.
>
> On Tue, May 2, 2023 at 8:48 PM Bill Tantzen
> wrote:
>
> > Mikhail,
> > Thanks for the quick reply. Here is the parser info:
> >
> > LuceneQParser
> >
> > ~~Bill
> >
> > On Tue, May
ically would maintain the non characters but also lead to more
> strict search constraints. If you try this, you need to reindex a couple of
> documents to make sure you are getting what you want.
>
> -Dave
>
> > On May 2, 2023, at 2:22 PM, Bill Tantzen
> wrote:
>
d in on this!
~~Bill
On Tue, May 2, 2023 at 3:56 PM Shawn Heisey wrote:
> On 5/2/23 13:16, Bill Tantzen wrote:
> > This tokenizer splits the text field into tokens, treating whitespace and
> > punctuation as delimiters.
> > Delimiter characters are discarded, with the foll
pected!
~~Bill
On Wed, May 3, 2023 at 10:04 AM Shawn Heisey wrote:
> On 5/2/23 15:30, Bill Tantzen wrote:
> > This works as I expected:
> > ab00c.tif -- tokenizes as it should with a value of ab00c.tif
> >
> > This doesn't work as I expected
> > ab00
e/lucene/issues/12264.
> > Let's look at what devs say.
> >
> > On Wed, May 3, 2023 at 6:13 PM Bill Tantzen
> > wrote:
> >
> > > Shawn,
> > > No, email addresses are not preserved -- from the docs:
> > >
> > >
> > >
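For readers following the thread, here is a rough sketch of why a dot is sometimes kept and sometimes treated as a delimiter. It assumes the behaviour comes from the Unicode word-break rules (UAX #29) that StandardTokenizer is documented to follow, reduced here to a single rule: a "." stays inside a token only when it sits between two letters or between two digits. The function name `tokenize` and the input `scan01.tif` are made up for illustration; this is not Lucene's implementation and ignores everything else the real tokenizer does.

```python
def tokenize(text):
    """Split text into tokens using a simplified dot rule (an approximation)."""
    tokens = []
    current = ""
    for i, ch in enumerate(text):
        if ch.isalnum():
            current += ch
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        # Assumed rule: keep "." only between letter/letter or digit/digit.
        joins_letters = prev_ch.isalpha() and next_ch.isalpha()
        joins_digits = prev_ch.isdigit() and next_ch.isdigit()
        if ch == "." and current and (joins_letters or joins_digits):
            current += ch
        else:
            # Any other non-alphanumeric character ends the current token.
            if current:
                tokens.append(current)
            current = ""
    if current:
        tokens.append(current)
    return tokens


print(tokenize("ab00c.tif"))   # letter before the dot: one token
print(tokenize("scan01.tif"))  # hypothetical name, digit before the dot: two tokens
```

Under this simplified rule, the `ab00c.tif` value quoted earlier in the thread stays whole, while a name with a digit directly before the dot is split at the dot, which would explain a query matching more documents than expected. The Analysis screen in the Solr admin UI is the authoritative way to check what the configured analyzer actually emits.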