Ahh, I knew I saw it somewhere, then I lost it again... :) I guess the
name is not quite intuitive, but anyway thanks a lot!
> and I'm just wondering if there is a tokenizer
> that would return me the whole text.
KeywordTokenizer does this.
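
For reference, a minimal sketch of that behaviour, written against the Lucene 4+ analysis API (package names differ in older 2.x/3.x releases). KeywordAnalyzer is just KeywordTokenizer with no filters, so the entire input comes back as a single token; the field name and sample text are only illustrative:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class WholeTextTokenDemo {
    public static void main(String[] args) throws Exception {
        Analyzer analyzer = new KeywordAnalyzer();  // KeywordTokenizer under the hood
        try (TokenStream ts = analyzer.tokenStream("field", "the whole input text, untouched")) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            while (ts.incrementToken()) {
                System.out.println(term.toString());  // prints the full input exactly once
            }
            ts.end();
        }
    }
}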
-
Hi there,
I have recently been trying to build a Lucene index out of n-grams and
seem to have stumbled onto a number of issues. I first tried to use the
NGramTokenizer, but it apparently only tokenizes the first 1024
characters of the input. Having searched around the web, I came upon this
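
As a point of comparison, this is roughly how an n-gram analyzer is wired up in a recent Lucene release (the single-argument createComponents signature is from 5.0 onward), where NGramTokenizer takes min/max gram sizes in its constructor and the 1024-character behaviour described above no longer applies; the gram sizes, field name and sample text are illustrative only:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.ngram.NGramTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class NGramAnalyzerSketch {
    public static void main(String[] args) throws Exception {
        // Analyzer whose tokenizer emits every 2- and 3-character gram of the input.
        Analyzer analyzer = new Analyzer() {
            @Override
            protected TokenStreamComponents createComponents(String fieldName) {
                Tokenizer source = new NGramTokenizer(2, 3);  // minGram = 2, maxGram = 3
                return new TokenStreamComponents(source);
            }
        };
        try (TokenStream ts = analyzer.tokenStream("body", "some input text longer than a handful of characters")) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            while (ts.incrementToken()) {
                System.out.println(term.toString());  // one line per gram
            }
            ts.end();
        }
    }
}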
Shai, you got it right. I want to be able to send "b bb" through the QP
with my custom analyzer, and get back "(b b$) (b bb$)" -- 2 terms with 2
tokens in the same position for each.
I want this to come natively from the engine, rather than being forced
in from the query end. I'm using diff
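
Not the poster's actual analyzer, but a generic sketch of the underlying technique: a TokenFilter that re-emits each token with a '$' marker appended, at position increment 0, so that both variants share one position and the query parser groups them as alternatives much like the output shown above. The class name and the '$' convention are made up for illustration:

import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
import org.apache.lucene.util.AttributeSource;

// For each incoming token, also emit "token$" stacked on the same position.
public final class SuffixVariantFilter extends TokenFilter {
    private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
    private final PositionIncrementAttribute posIncAtt = addAttribute(PositionIncrementAttribute.class);
    private AttributeSource.State pending;  // saved state of the token whose variant is still owed

    public SuffixVariantFilter(TokenStream input) {
        super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
        if (pending != null) {
            restoreState(pending);               // replay the previous token...
            pending = null;
            termAtt.append('$');                 // ...with the '$' marker appended
            posIncAtt.setPositionIncrement(0);   // same position as the original token
            return true;
        }
        if (!input.incrementToken()) {
            return false;
        }
        pending = captureState();                // remember it so the variant comes next
        return true;
    }

    @Override
    public void reset() throws IOException {
        super.reset();
        pending = null;
    }
}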