I searched Google for "TokenFilter lucene example" and found this link:

http://sujitpal.blogspot.com/2011/07/lucene-token-concatenating-tokenfilter_30.html

It seems to override incrementToken() (a guess, as I don't know Java). However, using Lucene.Net 3.0.3, I can override

    public override Token Next(Token result)
    public override Token Next()

but I can't figure out how to proceed from there. I tried to debug using

    public override Token Next(Token result)
    {
        Debug.WriteLine(string.Format(" --- {0}", result));
        return result;
    }

but that went nowhere. Any help on how to write my custom TokenFilter would be appreciated.

Also, the analyzer I have is set up as below, without the use of ReusableTokenStream() per the example in your link. Not sure if that makes a difference?

    class MyAnalyzer : Analyzer
    {
        public override TokenStream TokenStream(string fieldName, System.IO.TextReader reader)
        {
            TokenStream result = new WhitespaceTokenizer(reader);
            result = new LowerCaseFilter(result);
            result = new StandardFilter(result);
            result = new StopFilter(true, result, StopAnalyzer.ENGLISH_STOP_WORDS_SET);
            return result;
        }
    }

-----Original Message-----
From: Naresh [mailto:nnar...@gmail.com]
Sent: Monday, February 25, 2013 1:18 AM
To: java-user@lucene.apache.org
Subject: Re: Searching for keywords .net,c#,...

Hi,

You can write your own token filter to split on some characters (comma, |, etc.) and then build an analyzer using the WhitespaceTokenizer, LowerCaseFilter and your CustomTokenFilter.

See http://stackoverflow.com/questions/9015348/lucene-custom-analyzer/9015658#9015658

On Mon, Feb 25, 2013 at 11:24 AM, kumar <x10...@gmail.com> wrote:

> Hello all
>
> I am a Lucene novice and am trying to set up Lucene in a .NET app using
> Lucene.Net for searching through documents. So far it has been
> fantastic; however, given that the users' expectations are for
> "google"-like search, I am running into issues searching for .net and c#.
>
> Initially tried the StandardAnalyzer, which of course does not work for
> searching .net & c#.
> Changed that to a custom analyzer using WhitespaceTokenizer and
> LowerCaseFilter, and it works.
> However, some of the documents have the keywords as
>
> oracle,.net,C#,java etc. (i.e. separated by commas without any space)
>
> and this custom analyzer fails here.
>
> Looking for suggestions on how this might work, as I'm sure it's
> possible considering both Lucene and .net/c# have been around for a
> long, long while.
>
> It looks like PatternAnalyzer might be of some use in this case;
> however, I'm not quite sure how to use it and have found scant
> references to it.
>
> Any help is appreciated.
>
> Thanks
> kumar

--
Regards
Naresh
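
For the comma-separated keyword case described in the quoted message, here is a minimal sketch of the splitting step that such a custom filter would need to perform. It is plain C# and deliberately does not use any Lucene.Net types, so it says nothing about which method (Next() or IncrementToken()) your Lucene.Net build expects you to override. The class name DelimiterSplitSketch, the method Tokenize, and the delimiter set are all hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class DelimiterSplitSketch
{
    // Hypothetical delimiter set; the thread only mentions comma and '|'.
    private static readonly char[] Delimiters = { ',', '|' };

    // Splits on whitespace first (roughly what a whitespace tokenizer does),
    // then splits each token again on the delimiters and lowercases the pieces.
    // Characters such as '.' and '#' are left alone, so ".net" and "c#" survive.
    public static IEnumerable<string> Tokenize(string text)
    {
        // A null separator array makes Split use whitespace as the separator.
        string[] whitespaceTokens =
            text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        foreach (string token in whitespaceTokens)
        {
            foreach (string part in
                     token.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries))
            {
                yield return part.ToLowerInvariant();
            }
        }
    }

    public static void Main()
    {
        // Prints one term per line: oracle, .net, c#, java
        foreach (string term in Tokenize("oracle,.net,C#,java"))
        {
            Console.WriteLine(term);
        }
    }
}
```

Inside a real token filter the same idea would apply per incoming token: pull one token from the wrapped stream, split its text, and hand the pieces out one at a time before pulling the next token. How offsets and position increments should be set for the pieces is not covered here.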