A problem with SimpleAnalyzer: it ignores digits.
For the text "customer 123 found" it keeps only "customer" and "found" and
drops "123". StandardAnalyzer handles digits fine but has the problem with
dots that I mentioned before.
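To make the digit issue concrete, here is a minimal plain-Java sketch that does not use Lucene at all. It assumes the only difference that matters is which characters count as part of a token: letters only (how SimpleAnalyzer behaves) versus letters and digits. The class name TokenCharSketch and the tokenize helper are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical stand-in, not a Lucene class: shows how the choice of
// "token character" predicate decides whether digits survive.
class TokenCharSketch {

    // Collects runs of accepted characters into lowercase tokens.
    static List<String> tokenize(String text, IntPredicate isTokenChar) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isTokenChar.test(cp)) {
                current.appendCodePoint(Character.toLowerCase(cp));
            } else if (current.length() > 0) {
                // A rejected character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "customer 123 found";
        // Letters only: "123" is dropped.
        System.out.println(tokenize(text, Character::isLetter));        // [customer, found]
        // Letters and digits: "123" is kept.
        System.out.println(tokenize(text, Character::isLetterOrDigit)); // [customer, 123, found]
    }
}
```

In Lucene itself the same decision is usually made inside a tokenizer's token-character check, so a custom Analyzer for this case mostly comes down to accepting digits there.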
Is there an understandable guide on how to write my own Analyzer - a h
Lucene in Action has an example of creating a SynonymAnalyzer that
you can adapt. The general idea is to subclass Analyzer and
implement the required methods, perhaps wrapping a Tokenizer
in a chain of Filters.
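The "Tokenizer wrapped in Filters" idea can be sketched in plain Java without Lucene on the classpath. Everything below (TokenSource, whitespaceTokenizer, lowerCaseFilter, stopFilter, analyze) is a hypothetical stand-in and not Lucene's API; it only shows the shape of the chain, where each filter wraps the stream before it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-ins, not Lucene classes.
class FilterChainSketch {

    // Minimal token stream: returns the next token, or null when exhausted.
    interface TokenSource {
        String next();
    }

    // Plays the role of a Tokenizer: splits the input on whitespace.
    static TokenSource whitespaceTokenizer(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return () -> null;
        }
        Iterator<String> it = Arrays.asList(trimmed.split("\\s+")).iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    // Plays the role of a filter: lowercases every token it passes on.
    static TokenSource lowerCaseFilter(TokenSource in) {
        return () -> {
            String token = in.next();
            return token == null ? null : token.toLowerCase(Locale.ROOT);
        };
    }

    // Plays the role of a filter: skips tokens found in stopWords.
    static TokenSource stopFilter(TokenSource in, Set<String> stopWords) {
        return () -> {
            String token = in.next();
            while (token != null && stopWords.contains(token)) {
                token = in.next();
            }
            return token;
        };
    }

    // Plays the role of the Analyzer: builds the chain and drains it.
    static List<String> analyze(String text) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "a"));
        TokenSource stream =
                stopFilter(lowerCaseFilter(whitespaceTokenizer(text)), stopWords);
        List<String> tokens = new ArrayList<>();
        for (String t = stream.next(); t != null; t = stream.next()) {
            tokens.add(t);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Customer 123 found")); // [customer, 123, found]
    }
}
```

The order of the wrappers matters: the stop filter sits outside the lowercase filter, so it sees "the" rather than "The".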
You might be able to crib some ideas from
solr.analysis.WordDelimiterFilter
Best
I am overriding the coord method in my custom Similarity class so that it
returns (float)overlap/(float)maxOverlap;
I'll update you.
Thanks for your help
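For reference, the coord expression above is just the fraction of query terms that matched. A standalone sketch, with no Lucene dependency, might look like this; the class name CoordSketch and the guard for maxOverlap == 0 are my own assumptions and not part of the original override.

```java
// Hypothetical helper, not a Lucene class: the same arithmetic as the
// coord override described above, isolated so it can be run on its own.
class CoordSketch {

    // overlap: number of query terms matched in the document.
    // maxOverlap: total number of terms in the query.
    static float coord(int overlap, int maxOverlap) {
        if (maxOverlap == 0) {
            return 0f; // assumption: avoid dividing by zero for an empty query
        }
        return (float) overlap / (float) maxOverlap;
    }

    public static void main(String[] args) {
        System.out.println(coord(2, 4)); // 0.5
        System.out.println(coord(3, 3)); // 1.0
    }
}
```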
--
View this message in context:
http://lucene.472066.n3.nabble.com/Search-Score-percentage-Should-not-be-relative-to-the-highest-scor
OK, I managed to write the Analyzer I need. I can't say that I understand
all of Lucene's Analyzer-Tokenizer-Filter logic, but MyAnalyzer is attached.
I hope it helps somebody else.
import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.CharTokeni
I would think this is more like it.
But the essential thing, so it seems to me, is whether there is a
requirement for a serialised index, i.e. a more permanent record, aside from
the saved document.
Then, if there is a penalty to creating the index compared to regex,
string search or the like, it is justi
Hi,
I am trying to solve the big index problem by zipping the index and reading
it via VFS. I suppose I'll need to write my own Directory implementation.
Did you manage to do it? How is the search speed?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Opening-an-index-directory-inside
That did not work.
I am using my own Similarity, but the coord method is not called because the
disableCoord variable is set to true by FuzzyQuery:
public Similarity getSimilarity(Searcher searcher) {
    Similarity result = super.getSimilarity(searcher);
    if (disableCoord) {