The only reason the TS stuff needs an encoding spec is to figure out how
to read an external stop word file.  I think my suggestion upthread is a
lot better: have just one stop word file per language, store them all in
UTF8, and convert to database encoding when loading them.  The database

Hmm. You mean to use the language name in the configuration, use the current encoding to
determine which dictionary should be used (stemmers for the same language differ between encodings), and recode the dictionary files from UTF8 to the encoding of the current locale. Did I understand you right?

That's possible to do. But it's an incompatible change and will cause some difficulties for DBAs: if the server locale is ISO (or KOI8 or any other) and the file is in UTF8, then text editors and other tools might be confused.
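
To make the recoding step concrete, here is a rough sketch in Python (not the actual tsearch code) of loading a UTF8 stop word file and converting it to the database encoding. The function names and the choice to silently skip words that cannot be represented in the target encoding are assumptions for illustration only.

```python
def recode_stop_words(utf8_data, db_encoding):
    """Convert UTF-8 stop word data (bytes, one word per line) into a list
    of words encoded in db_encoding. Hypothetical helper, illustration only."""
    words = []
    for line in utf8_data.decode("utf-8").splitlines():
        word = line.strip()
        if not word:
            continue  # ignore blank lines
        try:
            words.append(word.encode(db_encoding))
        except UnicodeEncodeError:
            # Assumption: a word with no representation in the database
            # encoding can never match indexed text, so it is dropped.
            continue
    return words


def load_stop_words(path, db_encoding):
    """Read a stop word file stored in UTF-8 and recode it on load."""
    with open(path, "rb") as f:
        return recode_stop_words(f.read(), db_encoding)
```

With this approach the file on disk is always UTF8, whatever the server encoding is, which is exactly the situation where an editor running in a KOI8 or ISO locale would display it incorrectly.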


--
Teodor Sigaev                                   E-mail: [EMAIL PROTECTED]
                                                   WWW: http://www.sigaev.ru/
