OK, so it does indeed look like a problem with your analyzer, as you suspected.
You could confirm that by using e.g. WhitespaceAnalyzer instead. Then maybe post the code for your custom analyzer, or step through in a debugger or however you prefer to debug code.


--
Ian.


On Wed, Apr 21, 2010 at 8:20 AM, jm <jmugur...@gmail.com> wrote:
> I am using a TermQuery so no analyzer used...
>
>   protected static int getHitCount(Directory directory, String
>       fieldName, String searchString) throws IOException {
>     IndexSearcher searcher = new IndexSearcher(directory, true); //5
>     Term t = new Term(fieldName, searchString);
>     Query query = new TermQuery(t); //6
>     int hitCount = searcher.search(query, 1).totalHits;
>     searcher.close();
>     return hitCount;
>   }
>
> Yes, I have written the index to disk, and Luke shows the words
> without the numbers...
>
>
> On Tue, Apr 20, 2010 at 7:09 PM, Ian Lea <ian....@gmail.com> wrote:
>> Are you using the same analyzer for searching, in your unshown
>> getHitCount() method?
>>
>> There is lots of good advice in the FAQ under "Why am I getting no
>> hits / incorrect hits?". And/or write the index to disk and use Luke
>> to check that the correct content is being indexed.
>>
>>
>> --
>> Ian.
>>
>>
>> On Tue, Apr 20, 2010 at 4:58 PM, jm <jmugur...@gmail.com> wrote:
>>> I am encountering a strange issue. I have a CustomStopAnalyzer. If I
>>> do this (supporting code taken from AnalyzerUtils in LIA3 source code
>>> Mike uploaded):
>>>
>>>   Analyzer customStopAnalyzer = new CustomStopAnalyzer();
>>>   AnalyzerUtils.displayTokensWithFullDetails(customStopAnalyzer,
>>>       "mail77");
>>>
>>> I get what I expect:
>>>   1: [mail77:0->6:word]
>>>
>>> But when I am actually indexing docs, the words containing numbers
>>> lose the numbers.
>>>
>>>   directory = new RAMDirectory();
>>>   writer = new IndexWriter(directory, customStopAnalyzer,
>>>       IndexWriter.MaxFieldLength.UNLIMITED);
>>>   doc = new Document();
>>>   doc.add((Fieldable) new Field("contents", "mail77",
>>>       Field.Store.NO, Field.Index.ANALYZED));
>>>   writer.addDocument(doc);
>>>   writer.close();
>>>   hitCount = getHitCount(directory, "contents", "mail77");
>>>   System.out.println("mail77 " + hitCount);
>>>
>>> This writes
>>>   mail77 0
>>> If I look for "mail", I get one hit... I am using Lucene 3.0.1. Where
>>> should I start looking (I assume in CustomStopAnalyzer but the fact
>>> that displayTokensWithFullDetails() shows the right output puzzles
>>> me)??
>>>
>>> thanks
>>> javier
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
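
For anyone reading this thread later: the symptom above (one hit for "mail", none for "mail77", and Luke showing the words without the numbers) is what you would see if the tokens produced at index time came from a letter-only tokenizer, as with Lucene's LetterTokenizer family. Whether CustomStopAnalyzer actually does that is not known, since its code was not posted. The sketch below is plain Java with no Lucene classes; TokenizerSketch and its methods are hypothetical stand-ins, meant only to show why an unanalyzed exact-term lookup (like the TermQuery above) misses "mail77" in one case and finds it in the other.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: hypothetical stand-ins, not Lucene's implementation.
final class TokenizerSketch {

    // Letter-only tokenization: a token is a maximal run of letters
    // (lowercased), so digits act as separators and are dropped.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Whitespace tokenization: split on whitespace only, keep digits.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Exact-term lookup: the search string is compared as-is against the
    // indexed terms, with no analysis applied to it.
    static int hitCount(List<String> indexedTokens, String searchString) {
        Set<String> terms = new HashSet<>(indexedTokens);
        return terms.contains(searchString) ? 1 : 0;
    }

    public static void main(String[] args) {
        List<String> letters = letterTokens("mail77");
        List<String> whitespace = whitespaceTokens("mail77");
        System.out.println("letter-only terms: " + letters);       // [mail]
        System.out.println("mail77 " + hitCount(letters, "mail77")); // mail77 0
        System.out.println("mail " + hitCount(letters, "mail"));     // mail 1
        System.out.println("whitespace terms: " + whitespace);      // [mail77]
        System.out.println("mail77 " + hitCount(whitespace, "mail77")); // mail77 1
    }
}
```

One thing that may be worth checking, though it is only a guess without the analyzer source: in 3.0.x IndexWriter asks the analyzer for tokens via reusableTokenStream(), while a display helper may call tokenStream(). An analyzer that overrides only one of the two could therefore tokenize differently when indexing than when displayed.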