Index all bigrams. If you use a dictionary, there is a lot of work to maintain it, and it does not translate well to other languages. The downside of indexing bigrams is partial token hits, which decrease precision. But people who search for "well being" or "wellbeing" usually do not expect a 'well*' match in documents. You would have to measure the results on your data. An obvious example would be first and last names.
For every stream of tokens t1 t2 t3 ... tn, you would index t1t2, t2t3, ..., tn-1tn as well as the normal tokens. Index them into a separate non-stored field to allow control at query time.

On Tue, Aug 15, 2023 at 8:08 PM Ramkumar Krishnamoorthy <ramkumar1...@gmail.com> wrote:

> Hi All,
>
> I am struggling to find the right filter that can make it work for search
> queries like "well being" and "play space" to be able to match terms like
> wellbeing and playspace in documents.
>
> Tried to make it work with wordDelimiterGraph. But that only works if the
> word in the document is "WellBeing". Another option I am considering is
> using DictionaryCompoundWordTokenFilterFactory but I need to find a
> dictionary file for English that I can pass to it..
>
> Any suggestions on how this can be handled?
>
> Thanks,
> Kumar
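To make the token-pair scheme from the reply concrete (index t1t2, t2t3, ... alongside the normal tokens), here is a minimal Python sketch. It is not Solr code: the function names, the lowercasing, and the whitespace tokenization are assumptions made only for illustration, and the two returned lists merely stand in for the normal field and the separate pair field.

```python
def concatenated_bigrams(tokens):
    """Join each adjacent token pair with no separator: t1t2, t2t3, ..."""
    return [a + b for a, b in zip(tokens, tokens[1:])]


def analyze(text):
    """Hypothetical analyzer: lowercase, then split on whitespace.

    Returns (normal tokens, concatenated bigrams), kept apart the way
    the two index fields would be.
    """
    tokens = text.lower().split()
    return tokens, concatenated_bigrams(tokens)


# Document side: "well being" written as two words.
doc_tokens, doc_bigrams = analyze("Well being and play space")
print(doc_tokens)   # ['well', 'being', 'and', 'play', 'space']
print(doc_bigrams)  # ['wellbeing', 'beingand', 'andplay', 'playspace']

# A query for the single word "wellbeing" misses the normal tokens
# but hits the concatenated-pair field.
query = "wellbeing"
print(query in doc_tokens)   # False
print(query in doc_bigrams)  # True
```

Incidental terms such as "beingand" are indexed too, so the extra field grows with every adjacent pair. Keeping it separate from the normal tokens is what lets you decide at query time whether to search it at all.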