Hello,
On 2014-02-26 19:37, Furkan KAMACI wrote:
[...] If there is no such
implementation, can I implement a patch for it?
If you "really" want to implement this in Lucene/Java I guess you should
have a look at existing token filters in:
lucene/analysis/common/src/java/org/apache/lucene/
If this is primarily an issue with the document input, as opposed to
queries, you might be better off simply preprocessing the text before it is
given to Lucene to be indexed.
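As a rough sketch of that preprocessing route, the markup could be stripped with a regular expression before the text ever reaches the analyzer. The class and method names below are hypothetical, and the pattern assumes the markup always has the shape "link:" followed by a single run of non-whitespace characters, which may not hold for every wiki dump:

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: strips "link:<word>" markup from raw text.
final class WikiPreprocessor {
    // Assumption: markup is the literal "link:" followed by one run of non-whitespace.
    // The \b keeps words such as "hyperlink:" from matching.
    private static final Pattern LINK_MARKUP = Pattern.compile("\\blink:\\S+\\s*");

    // Returns the text with every "link:<word>" occurrence removed.
    static String stripLinkMarkup(String text) {
        return LINK_MARKUP.matcher(text).replaceAll("").trim();
    }
}
```

The cleaned string can then be handed to the indexing code unchanged, so no custom token filter is needed on the index side.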
-- Jack Krupansky
-Original Message-
From: Furkan KAMACI
Sent: Wednesday, February 26, 2014 1:37 PM
To: java
Hi;
I fixed the problems. StopFilter was not working as expected because of
letter case. I've changed the flags of WordDelimiter. Also, I've
changed TokenStream to TokenFilter.
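The letter-case problem can be shown outside Lucene with a plain-Java sketch: an exact-match stop set containing "link" does not match "Link" unless the token is normalized first. The class and method names here are hypothetical and this is not the Lucene API; inside an analyzer, a lowercasing filter placed before StopFilter or a case-insensitive CharArraySet would play the same role.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration, not Lucene code: case-insensitive stopword removal.
final class CaseAwareStopwords {
    // Assumption: the stopword set already holds lowercase entries only.
    static List<String> removeStopwords(List<String> tokens, Set<String> lowercaseStopwords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            // Lowercase before the lookup so "Link" and "LINK" match "link".
            if (!lowercaseStopwords.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```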
Thanks;
Furkan KAMACI
2014-02-26 20:05 GMT+02:00 Furkan KAMACI:
> Hi;
>
> I have implemented this custom Analyzer:
>
Hi;
I'm parsing a wiki dump file. It contains some special definitions. For
example:
link:km
When I parse this text I get two tokens: "link" and "km". I want to
remove "link", since it is a stopword in my case. However, I also want to
remove "km" if it comes right after the token "link". If there i
Hi;
I have implemented this custom Analyzer:
public class DisambiguatorAnalyzer extends Analyzer {
    Version version = Version.LUCENE_46;
    List<String> stopWordList;

    public DisambiguatorAnalyzer(List<String> stopWordList) throws IOException {
        super();
        this.stopWordList = stopWordList;
    }
Sounds like a custom filter.
Or maybe an option for stop filter or a specialization of stop filter.
Or maybe it could be even more generalized.
What are some practical example token sequences?
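To make the question concrete, here is a minimal sketch of the rule as I understand it, written over a plain list of tokens rather than a real Lucene stream. The class name, method name, and the sample sequence are all hypothetical; in an actual token filter the same flag would presumably live inside incrementToken():

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration, not Lucene code: drop each marker token
// and the single token that immediately follows it.
final class DropAfterMarker {
    static List<String> filter(List<String> tokens, Set<String> markers) {
        List<String> kept = new ArrayList<>();
        boolean skipNext = false;
        for (String token : tokens) {
            if (markers.contains(token)) {
                // Drop the marker itself and remember to drop its successor.
                skipNext = true;
                continue;
            }
            if (skipNext) {
                // This token directly follows a marker, so drop it too.
                skipNext = false;
                continue;
            }
            kept.add(token);
        }
        return kept;
    }
}
```

With the marker set {"link"}, the sequence "the", "link", "km", "road" would come out as "the", "road". Two markers in a row are both dropped, and the skip then applies to whatever follows the second one.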
-- Jack Krupansky
-Original Message-
From: Furkan KAMACI
Sent: Wednesday, February 26, 201
February 2014, Apache Lucene™ 4.7 available
The Lucene PMC is pleased to announce the release of Apache Lucene 4.7.
Apache Lucene is a high-performance, full-featured text search engine
library written entirely in Java. It is a technology suitable for nearly
any application that requires full-text
Hi;
How can I delete the token that comes immediately after a stopword
token, using StopwordFilter?
Thanks;
Furkan KAMACI