Hello, all -
I'd like to use Lucene's automaton/FST code for fast fuzzy search (OSA edit
distance up to 2) of many (10k+) strings (the knowledge base, "kb") within
many large strings (the docs).
The approach I was thinking of: for each kb string, build a Levenshtein FST
whose paths all map back to the unedited form.
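A minimal sketch of that idea using Lucene's LevenshteinAutomata class
(withTranspositions=true gives OSA rather than classic Levenshtein; the names
kbEntries and token are placeholders of mine, and this per-entry variant
recovers the unedited form by index rather than via FST outputs):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.apache.lucene.util.automaton.CharacterRunAutomaton;
import org.apache.lucene.util.automaton.LevenshteinAutomata;

List<String> kbEntries = Arrays.asList("lucene", "automaton"); // placeholder kb
List<CharacterRunAutomaton> matchers = new ArrayList<>();
for (String entry : kbEntries) {
  // Accepts every string within OSA edit distance 2 of the entry.
  matchers.add(new CharacterRunAutomaton(
      new LevenshteinAutomata(entry, true).toAutomaton(2)));
}

String token = "lucen"; // a candidate token pulled from a doc
for (int i = 0; i < matchers.size(); i++) {
  if (matchers.get(i).run(token)) {
    System.out.println(token + " -> " + kbEntries.get(i)); // unedited form
  }
}

Checking every entry per token obviously won't scale to 10k+ kb strings;
unioning the automata (Operations.union) or encoding the kb in an FST with
the unedited form as output would presumably be the next step.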
My solution (I think): create a Weight from the searcher and then call
scorer() on it for each LeafReaderContext:

Weight searcherWeight = searcher.createWeight(filter, false);
for (LeafReaderContext ctx : searcher.getIndexReader().leaves()) {
  Scorer leafReaderContextScorer = searcherWeight.scorer(ctx);
}
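To consume each per-segment scorer inside that loop, a sketch (assuming the
Lucene 5.x API, where Scorer itself extends
org.apache.lucene.search.DocIdSetIterator; from 6.0 on you would iterate
scorer.iterator() instead):

if (leafReaderContextScorer != null) { // scorer() returns null if the segment has no matches
  int doc;
  while ((doc = leafReaderContextScorer.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    int globalDoc = ctx.docBase + doc; // map segment-local docID to top-level docID
    // handle the match for globalDoc here
  }
}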
I finally dug into this, and it turns out the nightly benchmark I run had
bad bottlenecks: it couldn't feed documents to Lucene quickly enough to
take advantage of the concurrent hardware in beast2.
I fixed that and re-ran the nightly benchmark, and it shows good gains:
https://plus.google.
Hello,
I am having difficulty finding out in which release certain API changes were
made, and finding information about how to migrate.
For example, I was not able to find:
1) when the method FieldType.setIndexed(true) was dropped, and how to change
my code
2) the same for the method Query.extractTerms
I would appreciate any pointers.
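For context, the code I need to migrate looks roughly like this (a sketch of
the old 4.x-style usage; the names "type" and "query" are placeholders of
mine):

import java.util.HashSet;
import java.util.Set;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

// Old-style field configuration; this setter no longer compiles.
FieldType type = new FieldType();
type.setIndexed(true);
type.setStored(true);
type.freeze();

// Old-style term extraction; Query.extractTerms is also gone.
Query query = new TermQuery(new Term("body", "lucene"));
Set<Term> terms = new HashSet<>();
query.extractTerms(terms);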