I have a Spanish-language index and I use Snowball for stemming and analysis, which works perfectly. However, I am running into trouble storing (not indexing, only storing) words that contain special characters.
That is, I store the special character, but it comes back garbled when I read it. To provide an example:

    String content = "niños";
    doc.add(new Field("name", content, Field.Store.YES, Field.Index.TOKENIZED));
    writer.addDocument(doc, new SnowballAnalyzer("Spanish"));

When I read the field back:

    String nombre = doc.get("name");

Then nombre will contain "ni�os".

Looking at the index with Luke, it shows me "ni�os", but when I want to see the full text (by right clicking) it shows me ni�os.

I know Lucene is supposed to store fields in UTF-8, but then, how can I make sure I store something and get it back just as it was, including special characters?

Thanks

--
Juan Pablo Morales
Ingenian Software ltda
Bogotá, Colombia
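
To help narrow this down, here is a minimal sketch in plain Java, with no Lucene involved, of the kind of charset mismatch that can produce a replacement character like the one above. It is only an assumption that something similar is happening here (for example in how the source file is compiled or how the output is displayed); nothing in this message confirms it. The class name CharsetRoundTrip and the helper encodeThenDecode are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Hypothetical helper: encode text with one charset, decode the bytes with another.
    static String encodeThenDecode(String text, Charset writeCharset, Charset readCharset) {
        byte[] bytes = text.getBytes(writeCharset);
        return new String(bytes, readCharset);
    }

    public static void main(String[] args) {
        // The escape \u00f1 is the letter n with tilde. Writing it as an escape
        // keeps this example independent of the encoding of the source file.
        String original = "ni\u00f1os";

        // Same charset on both sides: the text survives unchanged.
        String sameCharset = encodeThenDecode(
                original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        // Bytes written as ISO-8859-1 but read as UTF-8: the single byte for
        // the accented letter is not valid UTF-8, so the decoder substitutes
        // the replacement character U+FFFD.
        String mismatched = encodeThenDecode(
                original, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        System.out.println("same charset round trip intact: " + original.equals(sameCharset));
        System.out.println("mismatched round trip intact:   " + original.equals(mismatched));
        System.out.println("mismatched contains U+FFFD:     " + (mismatched.indexOf('\uFFFD') >= 0));
    }
}
```

If the first check holds for the string right before it goes into the Field, the remaining suspects would be the places where bytes are turned into characters outside Lucene, such as the compiler's source encoding or the console or viewer used to display the result.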