Re: Indexing Non-English text

2007-12-04 Thread Grant Ingersoll
FileReader depends on your platform's default character encoding, which is derived from your locale, so it can misread a file saved in a different encoding. http://wiki.apache.org/lucene-java/IndexingOtherLanguages has some useful tips. Essentially, you need to make sure you control the encodings at all input points of your application. Lucene will do the appropriate thing internally. On Dec 4, 2
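To illustrate the point about controlling encodings at the input, here is a minimal sketch that replaces FileReader with an InputStreamReader given an explicit charset. It assumes the Urdu file is saved as UTF-8, which the thread does not confirm; the class name ExplicitEncodingReader and the helpers openUtf8 and readAll are hypothetical names used only for this example.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class ExplicitEncodingReader {

    // Hypothetical helper: opens the file as UTF-8 no matter what the
    // platform default encoding is. ASSUMPTION: the file really is UTF-8;
    // swap in the actual charset if the file was saved differently.
    static Reader openUtf8(String path) throws IOException {
        return new BufferedReader(
                new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8));
    }

    // Hypothetical helper: drains a Reader into a String and closes it,
    // so the decoded text can be inspected before indexing.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        try (Reader r = reader) {
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }
}
```

The Reader returned by openUtf8 could then be passed wherever the application currently passes a FileReader. Whatever displays the index contents also needs to decode and render the text correctly, or the output may still look wrong.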

Indexing Non-English text

2007-12-04 Thread Liaqat Ali
Hi, I'm facing a problem while indexing a small .txt file with Lucene. The file I want to index is in Urdu (a language written in a variant of the Arabic and Persian script). But the index I get is in Unicode form, not in the real form (the original Urdu text). This program works well for a file in Englis