I meant to say: my analyzer chain now looks like this.
I have added this custom filter at the end of my analyzer chain. Now only
my first document is getting indexed.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Writing-a-TokenConcatenateFilter-junk-characters-appearing-on-output-tp3383684p3384379.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
Thanks a million Uwe. That fixes it.
On Sat, Oct 1, 2011 at 4:16 AM, Uwe Schindler [via Lucene] <
ml-node+s472066n3383905...@n3.nabble.com> wrote:
> Hi,
>
> The junk is appended here: buffer.append(termAtt.buffer());
>
> I assume you are on Lucene 3.1+, so use buffer.append(termAtt); termAtt
> implements CharSequence, so it can be appended to any StringBuilder.
Hi,
The junk is appended here: buffer.append(termAtt.buffer());
I assume you are on Lucene 3.1+, so use buffer.append(termAtt); termAtt
implements CharSequence, so it can be appended to any StringBuilder.
The code you are using appends the whole char array, which may contain
junk characters after the end of the term (the scratch buffer is only
valid up to termAtt.length()).
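The effect Uwe describes can be reproduced in plain Java, without any Lucene dependency. This is a minimal sketch (the buffer size and term values are made up): CharTermAttribute keeps its term in an oversized char[] scratch buffer, and only the first length() characters are valid, so appending the raw array copies stale bytes too.

```java
// Demonstrates why appending the raw term buffer leaks junk characters.
// Lucene's CharTermAttribute reuses an oversized char[] scratch buffer;
// only the first length() chars belong to the current term.
public class BufferJunkDemo {
    public static void main(String[] args) {
        char[] buffer = new char[8];          // oversized scratch buffer
        "cat".getChars(0, 3, buffer, 0);      // current term: first 3 chars
        buffer[3] = 'X';                      // stale junk from an earlier, longer term
        buffer[4] = 'Y';

        StringBuilder wrong = new StringBuilder();
        wrong.append(buffer);                 // appends ALL 8 chars, junk included

        StringBuilder right = new StringBuilder();
        right.append(buffer, 0, 3);           // appends only the valid region
        // On Lucene 3.1+: builder.append(termAtt) does this for you, because
        // CharTermAttribute implements CharSequence.

        System.out.println(wrong.length());   // 8
        System.out.println(right);            // cat
    }
}
```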
Hi,
I am trying to write a TokenFilter which just concatenates all the tokens
in the input TokenStream.
The issue I am facing is that my filter is outputting certain junk characters
in addition to the concatenated string. I believe this is caused by
StringBuilder.
This is my incrementToken() function:
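The original incrementToken() snippet is truncated in the archive, so here is a hedged sketch of what such a concatenating filter might look like on Lucene 3.1+, with Uwe's fix applied. The class name, the space separator, and the one-shot "done" flag are assumptions; offset and position-increment handling is omitted for brevity.

```java
import java.io.IOException;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

/** Consumes the whole input stream and emits one concatenated token. */
public final class TokenConcatenateFilter extends TokenFilter {
    private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
    private boolean done = false;

    public TokenConcatenateFilter(TokenStream input) {
        super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
        if (done) {
            return false;
        }
        done = true;
        StringBuilder buffer = new StringBuilder();
        while (input.incrementToken()) {
            if (buffer.length() > 0) {
                buffer.append(' ');           // assumed separator
            }
            // CharTermAttribute implements CharSequence, so only the
            // valid chars are appended -- not the whole scratch buffer.
            buffer.append(termAtt);
        }
        if (buffer.length() == 0) {
            return false;                     // empty stream: emit nothing
        }
        clearAttributes();
        termAtt.setEmpty().append(buffer);
        return true;
    }

    @Override
    public void reset() throws IOException {
        super.reset();
        done = false;
    }
}
```

Note the reset() override: without it the filter would emit nothing when the stream is reused, which is one plausible cause of "only my first document is getting indexed".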
Thank you, Ian.
On Sep 30, 2011, at 4:19 AM, Ian Lea wrote:
> This all changed with the 3.1 release. See
> http://lucene.apache.org/java/3_1_0/changes/Changes.html#3.1.0.api_changes,
> number 18.
>
> You can get the old behaviour with StandardAnalyzer by passing
> VERSION_30, or you could look at UAX29URLEmailTokenizer.
This all changed with the 3.1 release. See
http://lucene.apache.org/java/3_1_0/changes/Changes.html#3.1.0.api_changes,
number 18.
You can get the old behaviour with StandardAnalyzer by passing
VERSION_30, or you could look at UAX29URLEmailTokenizer which should
pick up the email component, although ...
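Both options Ian mentions can be sketched roughly as below. The actual constant is Version.LUCENE_30 (not VERSION_30), the field name and sample email are made-up placeholders, and the UAX29URLEmailTokenizer constructor shown is the 3.1-era Reader one; this assumes lucene-core 3.1.x on the classpath.

```java
import java.io.StringReader;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.standard.UAX29URLEmailTokenizer;
import org.apache.lucene.util.Version;

public class LegacyEmailTokenizing {
    public static void main(String[] args) throws Exception {
        // Option 1: restore the pre-3.1 StandardAnalyzer behaviour, where
        // e-mail addresses and hostnames stay single tokens.
        StandardAnalyzer legacy = new StandardAnalyzer(Version.LUCENE_30);
        TokenStream ts =
            legacy.tokenStream("body", new StringReader("user@example.com"));

        // Option 2: the 3.1+ tokenizer that recognises URLs and e-mails
        // under the new UAX#29 rules.
        UAX29URLEmailTokenizer tokenizer =
            new UAX29URLEmailTokenizer(new StringReader("user@example.com"));
    }
}
```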