Can I detect incorrect language selection after creating an index?

Ilya Zavorin Mon, 27 Feb 2012 07:54:17 -0800

Suppose I have a bunch of text documents in language X but I index them using 
an analyzer for language Y. Once the index is created, is it possible to 
perform some sort of simple "sanity" check to see if the original language 
selection was wrong? I presume I can try searching for some common word in 
language Y, but I am not sure how reliable this would be. If the two 
languages are from the same group, say X and Y are English and Spanish, I 
would expect this sanity check to produce false matches. However, I 
would be happy if it worked reliably enough for languages using different 
scripts, e.g. Latin vs Cyrillic vs Arabic vs Chinese etc.
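
For the different-scripts case, one possible check is to sample terms from the finished index and see which script dominates. The sketch below is only an illustration under stated assumptions: the terms are assumed to have already been pulled out of the index into a plain list (how to do that is not shown), the function names are hypothetical, and the script is approximated by the first word of each character's Unicode name rather than the real Unicode Script property. It also assumes the wrong analyzer left the original characters in the indexed terms.

```python
import unicodedata
from collections import Counter


def dominant_script(terms):
    """Return (script, share) for the most common script among the letters in terms.

    The script is approximated by the first word of the Unicode character
    name, e.g. "CYRILLIC SMALL LETTER DE" -> "CYRILLIC". This is a heuristic,
    not the official Unicode Script property.
    """
    counts = Counter()
    for term in terms:
        for ch in term:
            if not ch.isalpha():
                continue  # skip digits, punctuation, etc.
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    if not counts:
        return None, 0.0
    script, n = counts.most_common(1)[0]
    return script, n / sum(counts.values())


def looks_mismatched(terms, expected_script, threshold=0.8):
    """Flag the index if the expected script is not clearly dominant.

    expected_script and threshold are assumptions chosen for illustration.
    """
    script, share = dominant_script(terms)
    if script is None:
        return False  # nothing to judge
    return not (script == expected_script and share >= threshold)


# Hypothetical sample of terms read back from an index built with an
# analyzer that expects Latin-script text.
sample_terms = ["привет", "мир", "документ"]
print(dominant_script(sample_terms))
print(looks_mismatched(sample_terms, "LATIN"))
```

A check like this says nothing about English vs Spanish, since both come out as LATIN; it only addresses the Latin vs Cyrillic vs Arabic vs Chinese case.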



Thanks much



Ilya Zavorin
