If this is primarily an issue with the document input, as opposed to
queries, you might be better off simply preprocessing the text before it is
given to Lucene to be indexed.
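
A minimal sketch of that kind of preprocessing, in plain Java with no Lucene
dependency. The class name, method name, and the regular expression are
assumptions made for illustration: it treats the markup as the literal word
"link", a colon, and one run of non-whitespace characters (as in "link:km"),
which may well be too narrow or too broad for a real wiki dump.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: strips "link:xyz" markers from raw
// text before the text is handed to an analyzer for indexing.
final class WikiLinkPreprocessor {
    // Assumption: a marker is the word "link", a colon, then non-whitespace.
    // \b keeps words such as "hyperlink:km" from matching.
    private static final Pattern LINK_MARKER = Pattern.compile("\\blink:\\S+");

    static String strip(String text) {
        // Delete each marker; the whitespace around it is left untouched.
        return LINK_MARKER.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // The marker is removed; the spaces on either side of it remain.
        System.out.println(strip("The road is 5 link:km long"));
    }
}
```

The leftover whitespace should not matter to a tokenizer that splits on
whitespace, and the match is case-sensitive, so "Link:km" would be kept.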
-- Jack Krupansky
-----Original Message-----
From: Furkan KAMACI
Sent: Wednesday, February 26, 2014 1:37 PM
To: java-user@lucene.apache.org
Subject: Re: How to delete a token that comes exactly after a token
Hi;
I'm parsing a wiki dump file. It contains some special definitions, for
example:
link:km
so when I parse my text I get these tokens: "link" and "km". I want to
remove "link" because it is a stopword in my case. However, I also want to
remove "km" when it comes directly after the token "link". If there is no
such implementation, can I contribute a patch for it?
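
The behavior described above can be sketched on a plain list of tokens. This
is only the filtering logic, written in plain Java; the class and method names
are hypothetical, and it does not use the Lucene API. In Lucene the same
"skip the next token" state would presumably live inside a custom TokenFilter
subclass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the logic only; not a Lucene TokenFilter.
final class FollowerDroppingSketch {
    // Drops every occurrence of `trigger` and the single token right after it.
    static List<String> dropTriggerAndFollower(List<String> tokens, String trigger) {
        List<String> kept = new ArrayList<>();
        boolean skipNext = false;
        for (String token : tokens) {
            if (token.equals(trigger)) {
                skipNext = true;   // drop the trigger and remember to drop its follower
                continue;
            }
            if (skipNext) {
                skipNext = false;  // this token directly follows a trigger: drop it
                continue;
            }
            kept.add(token);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("5", "link", "km", "road");
        System.out.println(dropTriggerAndFollower(tokens, "link")); // prints [5, road]
    }
}
```

A "km" that does not follow "link" is kept, and a run of consecutive triggers
is dropped together with the one token after the last of them.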
Thanks;
Furkan KAMACI
2014-02-26 17:36 GMT+02:00 Jack Krupansky <j...@basetechnology.com>:
Sounds like a custom filter.
Or maybe an option for stop filter or a specialization of stop filter.
Or maybe it could be even more generalized.
What are some practical example token sequences?
-- Jack Krupansky
-----Original Message-----
From: Furkan KAMACI
Sent: Wednesday, February 26, 2014 9:48 AM
To: java-user@lucene.apache.org
Subject: How to delete a token that comes exactly after a token
Hi;
How can I delete the token that comes directly after another token when
using StopwordFilter?
Thanks;
Furkan KAMACI
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org