OK, so it does indeed look like a problem with your analyzer, as you suspected.

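You could confirm that by indexing with e.g. WhitespaceAnalyzer instead:
if the TermQuery for "mail77" then gets a hit, your custom analyzer is
stripping the digits at index time.  After that, maybe post the code for
your custom analyzer, or step through it in a debugger, or however you
prefer to debug code.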
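
To illustrate the kind of mismatch that would produce these symptoms,
here is a small sketch in plain Java with no Lucene dependency.  It
assumes, purely for illustration, that the index-time analysis keeps only
letters; the real CustomStopAnalyzer has not been posted, so this is a
guess at the mechanism, not a diagnosis.  The class and method names
(AnalyzerMismatchSketch, letterTokens, whitespaceTokens, hitCount) are
hypothetical and not part of Lucene.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AnalyzerMismatchSketch {

    // Hypothetical stand-in for an analyzer that keeps only letters.
    // Assumption: this mimics what the custom analyzer might do at index time.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the current token; digits are dropped.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Stand-in for whitespace-only tokenization: split on whitespace, keep digits.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Exact-term lookup, in the spirit of a TermQuery: the search string
    // is compared to the indexed terms as-is, with no analysis applied.
    static int hitCount(Set<String> indexedTerms, String term) {
        return indexedTerms.contains(term) ? 1 : 0;
    }

    public static void main(String[] args) {
        Set<String> letterIndex = new HashSet<String>(letterTokens("mail77"));
        Set<String> whitespaceIndex = new HashSet<String>(whitespaceTokens("mail77"));

        System.out.println("letter-only, mail77: " + hitCount(letterIndex, "mail77"));
        System.out.println("letter-only, mail: " + hitCount(letterIndex, "mail"));
        System.out.println("whitespace, mail77: " + hitCount(whitespaceIndex, "mail77"));
    }
}
```

Against the letter-only terms this prints 0 for "mail77" and 1 for
"mail", which is the pattern you reported; against the whitespace terms
"mail77" gets 1.  If swapping in WhitespaceAnalyzer changes your real
results the same way, the digits are being dropped somewhere in the
custom analyzer's index-time chain.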


--
Ian.


On Wed, Apr 21, 2010 at 8:20 AM, jm <jmugur...@gmail.com> wrote:
> I am using a TermQuery, so no analyzer is used at search time:
> protected static int getHitCount(Directory directory, String fieldName,
>        String searchString) throws IOException {
>        // Open the index read-only
>        IndexSearcher searcher = new IndexSearcher(directory, true);
>        try {
>            // TermQuery matches the exact indexed term; searchString is not analyzed
>            Term t = new Term(fieldName, searchString);
>            Query query = new TermQuery(t);
>            return searcher.search(query, 1).totalHits;
>        } finally {
>            // Release the searcher even if the search throws
>            searcher.close();
>        }
> }
>
> Yes, I have written the index to disk, and Luke shows the words
> without the numbers...
>
>
> On Tue, Apr 20, 2010 at 7:09 PM, Ian Lea <ian....@gmail.com> wrote:
>> Are you using the same analyzer for searching, in your unshown
>> getHitCount() method?
>>
>> There is lots of good advice in the FAQ under "Why am I getting no
>> hits / incorrect hits?".  And/or write the index to disk and use Luke
>> to check that the correct content is being indexed.
>>
>>
>> --
>> Ian.
>>
>>
>> On Tue, Apr 20, 2010 at 4:58 PM, jm <jmugur...@gmail.com> wrote:
>>> I am encountering a strange issue. I have a CustomStopAnalyzer. If I
>>> do this (supporting code taken from AnalyzerUtils in LIA3 source code
>>> Mike uploaded):
>>>        Analyzer customStopAnalyzer = new CustomStopAnalyzer();
>>>        AnalyzerUtils.displayTokensWithFullDetails(customStopAnalyzer,
>>> "mail77");
>>>
>>> I get what I expect:
>>> 1: [mail77:0->6:word]
>>>
>>> But when I am actually indexing docs, words containing numbers
>>> lose the numbers.
>>>        directory = new RAMDirectory();
>>>        writer = new IndexWriter(directory, customStopAnalyzer,
>>> IndexWriter.MaxFieldLength.UNLIMITED);
>>>        doc = new Document();
>>>        doc.add((Fieldable) new Field("contents", "mail77",
>>> Field.Store.NO, Field.Index.ANALYZED));
>>>        writer.addDocument(doc);
>>>        writer.close();
>>>        hitCount = getHitCount(directory, "contents", "mail77");
>>>        System.out.println("mail77 " + hitCount);
>>>
>>> This writes
>>> mail77 0
>>> If I look for "mail", I get one hit. I am using Lucene 3.0.1. Where
>>> should I start looking? I assume in CustomStopAnalyzer, but the fact
>>> that displayTokensWithFullDetails() shows the right output puzzles
>>> me.
>>>
>>> thanks
>>> javier
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>
>>>

