That's good to know. If we go this route, we'll definitely either use the
factory or follow its example. Thanks again.
-Mike
On Mon, Jun 4, 2018 at 9:12 PM, Robert Muir wrote:
There may be traps, e.g. if you make such a filter with UnicodeSet,
I think you really need to call .freeze() before passing it to this
thing. I have not examined the sources in a while but I think this
might be similar to "compiling a regexp" in that you'll then get good
performance when its lat
Ah thanks! That's very good to know. As it is, I realized we already have an
earlier component where we can handle this (we have a custom ICUTokenizer
rbbi and can just split on "^"). So much flexibility
-Mike
On Mon, Jun 4, 2018 at 10:53 AM, Robert Muir wrote:
Actually, you can now choose to ignore certain characters by using the
Unicode filtering mechanism.
This was added in https://issues.apache.org/jira/browse/LUCENE-8129
So apply a filter such as [^\^] and the filter will ignore ^.
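
To illustrate what such a filter does, here is a minimal sketch in plain Java. It does not use ICU or Lucene at all: `foldFiltered` and `toyFold` are hypothetical names, and the toy fold (lowercase, drop `^`) is only an assumed stand-in for the real folding rules. Characters matched by the filter are folded; everything else is copied through untouched, which is the effect described above for [^\^].

```java
import java.util.Locale;
import java.util.function.IntPredicate;
import java.util.function.UnaryOperator;

class FilteredFoldSketch {

    // Fold only the runs of code points accepted by inFilter;
    // code points rejected by the filter are copied through unchanged.
    public static String foldFiltered(String input, IntPredicate inFilter,
                                      UnaryOperator<String> fold) {
        StringBuilder out = new StringBuilder();
        StringBuilder span = new StringBuilder(); // current run of filtered-in code points
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (inFilter.test(cp)) {
                span.appendCodePoint(cp);
            } else {
                if (span.length() > 0) {
                    out.append(fold.apply(span.toString()));
                    span.setLength(0);
                }
                out.appendCodePoint(cp); // outside the filter: left as-is
            }
            i += Character.charCount(cp);
        }
        if (span.length() > 0) {
            out.append(fold.apply(span.toString()));
        }
        return out.toString();
    }

    // Hypothetical stand-in for a folding step: lowercases and drops '^'.
    // This is NOT what ICU's folding does in general.
    public static String toyFold(String s) {
        return s.toLowerCase(Locale.ROOT).replace("^", "");
    }

    public static void main(String[] args) {
        IntPredicate everything = cp -> true;        // no filtering
        IntPredicate allButCaret = cp -> cp != '^';  // plays the role of [^\^]
        System.out.println(foldFiltered("Foo^Bar", everything, FilteredFoldSketch::toyFold));
        System.out.println(foldFiltered("Foo^Bar", allButCaret, FilteredFoldSketch::toyFold));
    }
}
```

With no filtering the toy fold removes the caret; with the all-but-caret predicate the caret survives while the surrounding text is still folded.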
On Mon, Jun 4, 2018 at 10:41 AM, Robert Muir wrote:
This cannot be "tweaked" at runtime; it is implemented as custom normalization.
You can modify the sources / build your own ruleset or use a different
tokenfilter to normalize characters.
On Mon, Jun 4, 2018 at 9:07 AM, Michael Sokolov wrote:
> Hi, I'm using ICUFoldingFilter and for the most par
Uwe
-
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -----Original Message-----
> From: Allison, Timothy B. [mailto:talli...@mitre.org]
> Sent: Wednesday, August 16, 2017 4:41 AM
> To: java-user@lucene.apache.org
> Subject: RE: IC
never mind...overwriting service file...
-----Original Message-----
From: Allison, Timothy B. [mailto:talli...@mitre.org]
Sent: Tuesday, August 15, 2017 10:36 PM
To: java-user@lucene.apache.org
Subject: ICUFoldingFilter loading in IDE, but not jar ?!
In Intellij, when I run unit tests in my app