Hi Luke,
Regarding what you've described as a "bug" in NLPPOSTaggerOp, I do agree with
you that there could be a more elegant solution than simply synchronizing the
entire method. That said, IMHO, I don't see a thread-safety issue here.
Lucene TokenFilters are not supposed to be shared am
Hello, Benoit.
I just came across
https://lucene.apache.org/core/8_0_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/TypeAsSynonymFilterFactory.html
It sounds similar to what you are asking for, but it looks at TypeAttribute only.
Also, spans are superseded by intervals
https://lucene.apache
Hi Luke,
Thank you for your work and for sharing this information. From my point of
view, lemmatization is just one use case of text token annotation. I have been
working with Lucene since 2006 to index lexicographic and linguistic
data, and I have always regretted that (1) token attributes are not
search
Uwe, I think that Petko's question was about making sure that missing
values are returned before non-missing values, even when some of
these non-missing values are equal to Long.MIN_VALUE. That isn't
possible today.
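
To make the tie concrete, here is a minimal sketch in plain Java. It is not Lucene code: the Doc class, the method names and the use of OptionalLong are assumptions made up for illustration. It only shows why substituting Long.MIN_VALUE for a missing value cannot separate missing documents from documents that really contain Long.MIN_VALUE, and what the requested "missing strictly first" ordering would look like.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical stand-in for a document with an optional long sort field.
final class Doc {
    final String id;
    final OptionalLong value; // empty means the field is missing

    Doc(String id, OptionalLong value) {
        this.id = id;
        this.value = value;
    }
}

final class MissingValueSortDemo {

    // Sentinel approach: a missing value is replaced by Long.MIN_VALUE.
    // A missing doc and a doc holding Long.MIN_VALUE compare as equal,
    // so their relative order is just the (stable) input order.
    static List<String> sortWithSentinel(List<Doc> docs) {
        List<Doc> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator.comparingLong(
                (Doc d) -> d.value.orElse(Long.MIN_VALUE)));
        return ids(sorted);
    }

    // Requested behaviour: compare on presence first (false sorts before
    // true), then on the value, so missing docs always come first.
    static List<String> sortMissingFirst(List<Doc> docs) {
        List<Doc> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator
                .comparing((Doc d) -> d.value.isPresent())
                .thenComparingLong(d -> d.value.orElse(0L)));
        return ids(sorted);
    }

    private static List<String> ids(List<Doc> docs) {
        List<String> out = new ArrayList<>();
        for (Doc d : docs) {
            out.add(d.id);
        }
        return out;
    }
}
```

With three docs a = Long.MIN_VALUE, b = missing and c = 5, given in that order, the sentinel sort keeps a ahead of b because the two tie, while the presence-first sort puts b ahead of a.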
I agree with your recommendation against going with bytes given the
o
Hi,
Long.MIN_VALUE and Long.MAX_VALUE are the correct missing values for sorting
longs. It is true that if you have Long.MIN_VALUE in your collection, empty
values are treated the same as it, but the empty values will still appear at
the wanted end of the results. In contrast to the default "0", they do not
end up somewhere in the middle.
Beca