Are you using ISOLatin1AccentFilter?

Regards,
Lucas Frare A. Teixeira
[EMAIL PROTECTED]
Tel: +55 11 3660.1622 - R3018

Vinicius Carvalho wrote:
Hello there!

I'm indexing documents using the BrazilianAnalyzer, and I've noticed that many words are not being indexed. I store and index the entire doc (I do this so I can present fragments in the results; I don't know if it's the best way, especially for large docs. Any ideas?).

Using Luke to check the index, I open the stored doc, and its contents contain 17 occurrences of the word "herança", for instance. But there's no term for this word or for its stemmed version, "heranc", so searching for this word would not return this document.

I'm pretty sure I'm missing something in the indexing process:

    try {
        doc.add(new Field("contents", docText, Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));
        IndexWriter writer = new IndexWriter("/java/lucene/portal/cms", new BrazilianAnalyzer()); // gotta improve this later
        writer.addDocument(doc);
        writer.close();
    }

So why would these words (and others) not be indexed?

Regards
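
For context on the ISOLatin1AccentFilter question at the top of this thread: that filter replaces accented Latin-1 characters with unaccented ones, so a word like "herança" would end up in the index as a term without the cedilla. The thread does not establish that this is the cause here. The sketch below only illustrates the general idea of accent folding using the JDK's java.text.Normalizer; the class AccentFoldDemo and the helper foldAccents are hypothetical names for illustration, not part of Lucene, and the result may differ from what Lucene's filter or the BrazilianAnalyzer stemmer actually emits.

```java
import java.text.Normalizer;

public class AccentFoldDemo {

    // Hypothetical helper (not a Lucene API): decompose each character into
    // base letter + combining marks (NFD), then strip the combining marks.
    static String foldAccents(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        String word = "heran\u00e7a"; // "herança", escaped to avoid source-encoding issues
        System.out.println(foldAccents(word)); // prints "heranca"
    }
}
```

If folding like this is applied on only one side (index time or query time), a search for the accented form and the indexed term may not match, so when browsing terms in Luke it may be worth looking for the folded form as well as the accented one.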