Priem, Alexander said:
> I recreated my entire database (luckily I keep scripts for
> table/index/view creation) and initdb-ed it using
> --lc-collate=C --encoding=UNICODE. In my psqlODBC DSN settings I added
> "set client_encoding='LATIN9';" to the Connect Settings and that solved
> all my problems regarding the special characters.

Does anyone know what effect --lc-collate=C --encoding=UNICODE will have on sorts (and indexes?) when a multibyte Unicode character is encountered? Is --lc-collate=C --encoding=UNICODE even a valid combination? And if it is valid, what unexpected nasties could it cause?

Is it also true that if LC_COLLATE != 'C', indexes cannot be used for LIKE comparisons (and is this also true for en_US.iso885915)?

Our database is UNICODE with LC_COLLATE=en_US.iso885915. Does anyone know what happens when someone stores a Cyrillic, Chinese, or Korean character? (We are using JDBC with a webapp, so all the Unicode concerns are handled transparently, apparently.) When the data is extracted from the DB, will it render correctly in the browser provided we send all responses encoded in UTF-8?

Although http://www.postgresql.org/docs/7.4/interactive/charset.html describes the Postgres-specific implementation and how to configure for a given locale, the subtle nuances of combining an encoding with LC_COLLATE, and the tradeoffs involved, are not entirely clear (to me at least). For example, are the performance penalties of using UNICODE over ASCII significant?

Maybe it's just my inexperience, but this topic seems to cause lots of questions. A good, simple technote would be really useful... I'd do one, but I really don't know my ass from my elbow around this topic (and probably many others too!).

Thanks for any answers/feedback/more info.

John Sidney-Woollett