Aalap Parikh wrote:
same field name for all terms. lucene remembers terms positions withing document, so it gives you ability to use phrase query.Hi Volodymyr,
About the trick you described about wildcard search replacement, you mentioned:
So I found following workaround. I index this fieldas > sequence of terms, each of containing single
digit from > needed value. (For example I have
“123214213” value
that needs to be indexed. Then it will be indexed asterms.)
sequence of “1”,”2”,”3”,”2”,”1”,”4”,”2”,”1”,”3”
This can be done by custom Analyzer class.
When you index "123214213" as sequence of individual
terms, what is the name of the the fields you use for
each of the individual terms?
Also, can you post your custom analyzer class?
public class CharFilter extends TokenFilter {
private String text = null;
private int index = 0;
public CharFilter(TokenStream in) { super(in); }
public Token next() throws IOException { if (text == null) { Token token = input.next(); if (token == null) { return null; } text = token.termText(); index = 0; }
char c = text.charAt(index++); Token token = new Token(String.valueOf(c), index, index - 1); if (index >= text.length()) { text = null; }
return token; } }
public class CharSplitter extends StandardAnalyzer {
public TokenStream tokenStream(String fieldName, Reader reader) { TokenStream stream = super.tokenStream(fieldName, reader);
if ("file".equals(fieldName)) { return new CharFilter(stream); } return stream;
} }
here this field is named "file" but you could change it to arbitrary.
Also this analyzer is not used in any application, I wrote it only to measure search speed.
Thanks, Aalap.
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
regards, Volodymyr Bychkoviak
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]