That was an easy fix. Everything works as expected now. Thanks again.
-----Original Message-----
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Thursday, December 05, 2013 1:46 PM
To: java-user@lucene.apache.org
Subject: RE: Analyzers aren't reusable?? (lucene 4.2.1)
The problem is the Ch
Thanks for the quick response. I'll read through the references.
Thanks again
Scott
-----Original Message-----
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Thursday, December 05, 2013 1:46 PM
To: java-user@lucene.apache.org
Subject: RE: Analyzers aren't reusable?? (lucene 4.2.1)
The p
The problem is the CharFilter, which cannot be reused. To implement the
Analyzer correctly, wrap the incoming Reader in the protected initReader()
method:
http://lucene.apache.org/core/4_6_0/core/org/apache/lucene/analysis/Analyzer.html#initReader(java.lang.String, java.io.Reader)
In creat
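The pattern described above (reuse the tokenizer, but wrap each incoming Reader afresh in a per-call hook) can be sketched in plain Java. This is only an illustration under stated assumptions: UpperCaseReader, SketchTokenizer and SketchAnalyzer are hypothetical names made up for this sketch, not Lucene classes, and the upper-casing filter is an arbitrary stand-in for a real CharFilter.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a CharFilter: upper-cases characters as they are read. */
class UpperCaseReader extends FilterReader {
    UpperCaseReader(Reader in) { super(in); }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return c < 0 ? c : Character.toUpperCase(c);
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        // n is -1 at end of input, in which case the loop body never runs.
        for (int i = off; i < off + n; i++) buf[i] = Character.toUpperCase(buf[i]);
        return n;
    }
}

/** Hypothetical reusable tokenizer: created once, then pointed at a new Reader per input. */
class SketchTokenizer {
    private Reader input;

    void setReader(Reader input) { this.input = input; }

    List<String> tokens() throws IOException {
        StringBuilder text = new StringBuilder();
        for (int c = input.read(); c >= 0; c = input.read()) text.append((char) c);
        List<String> out = new ArrayList<>();
        for (String t : text.toString().split("\\s+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }
}

/** Hypothetical analyzer: the tokenizer is cached, the reader wrapping is not. */
class SketchAnalyzer {
    private SketchTokenizer cached; // built on first use, reused afterwards

    /** Per-call hook, playing the role that initReader() plays in the advice above. */
    protected Reader initReader(Reader reader) {
        return new UpperCaseReader(reader);
    }

    List<String> analyze(String text) {
        if (cached == null) cached = new SketchTokenizer();
        // The wrapper is created for every input, so no stale filter is ever reused.
        cached.setReader(initReader(new StringReader(text)));
        try {
            return cached.tokens();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling analyze() twice on the same SketchAnalyzer instance applies the filter to both inputs, because only the tokenizer is cached and the wrapper is rebuilt on each call.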
I wrote the following to demonstrate behavior that I found surprising (this
is Lucene 4.2.1). If you want to run this yourself, you should be able to,
since there are no references to anything other than the standard Lucene and
Java libraries. Basically, this is an analyzer that makes everything
You can also string together any of a myriad of TokenFilters, see:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
I'd recommend spending some time on the admin/analysis page
to understand what all the combinations do. I'd also recommend
against dealing with punctuation etc. by using wi
Hi,
StandardAnalyzer includes these filters by default:
StandardFilter, LowerCaseFilter and StopFilter
You could consider char filters. Have you read this page:
https://cwiki.apache.org/confluence/display/solr/CharFilterFactories
Thanks,
Furkan KAMACI
2013/12/5
> Hi,
>
> I have used StandardAnalyzer in
Hi,
I have used StandardAnalyzer in my code and it is working fine. One of the
challenges I face is that this Analyzer by default tokenizes on some special
characters, such as the hyphen, in addition to the SPACE character.
I want to tokenize only on the SPACE character. Could you please
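To make the desired behavior concrete, here is a minimal plain-Java sketch of tokenizing on the SPACE character only, so that hyphenated words stay intact. It does not use Lucene at all, and the class name SpaceOnlySplit is hypothetical. Lucene does ship a WhitespaceTokenizer and WhitespaceAnalyzer, though those split on all whitespace rather than on SPACE alone.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper illustrating space-only tokenization; not a Lucene class. */
class SpaceOnlySplit {
    /** Splits on the SPACE character only; hyphens and other punctuation stay inside tokens. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split(" ")) {
            // Consecutive or leading spaces produce empty pieces; skip them.
            if (!piece.isEmpty()) tokens.add(piece);
        }
        return tokens;
    }
}
```

With this sketch, an input such as "state-of-the-art search" yields two tokens, with the hyphenated word kept whole.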