> The usual way to deal with this is to convert the J text from S-JIS (which
> will almost always cause problems) to either EUC-JP or UTF8 encoding before
> inserting it into the DB or otherwise messing with it. You can then convert
> it back to SJIS before sending it to the client.

After reading this, I started thinking in terms of character sets and dug a little more, and lo-and-behold, I discovered that our installation of PostgreSQL was configured with "--enable-multibyte=EUC_JP". No wonder I'm having problems!

Okay, I'm convinced. Now first I have to convert my existing data, which although sitting in a database that expects EUC, is actually SJIS-based text. I found the following series of bash commands in a Japanese mailing list archive - does it look like this will work for me? (It looks scary to just drop the whole database and hope that the .out file knows how to rebuild it with all the indexes, sequences, users, etc. in place - should I be nervous?)

    $ pg_dump -D dbname > db.out
    $ dropdb dbname
    $ createdb -E EUC_JP dbname
    $ export PGCLIENTENCODING=SJIS
    $ psql dbname < db.out
    $ export PGCLIENTENCODING=EUC_JP

Regarding the user interface end, when I read the suggested solution of using jcode to convert everything in and out of the database, I thought, "That's tedious! Why not just use EUC on the web pages, and the whole system will be in sync?" But that seems to be almost as tedious. The Windows-based editor I normally use to input the Japanese text portions of my code (I do most of the work in vi on my Linux box, but I can't input the Japanese that way) reads and writes in Shift-JIS unless I use pre- and post-processing filters, and it seems that other Windows programs also favor Shift-JIS.

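For reference, the "convert everything in and out of the database" approach discussed above can be sketched roughly as follows. This is only a minimal sketch in Python, not the jcode library itself: the function names sjis_to_euc and euc_to_sjis are hypothetical, and the only things taken from a real library are Python's standard codec names "shift_jis" and "euc_jp". It assumes the incoming bytes really are valid Shift-JIS; anything else raises an error rather than being silently stored.

```python
# Hypothetical helpers illustrating conversion at the database boundary.
# Only the codec names ("shift_jis", "euc_jp") come from Python's stdlib.

SJIS = "shift_jis"
EUC = "euc_jp"


def sjis_to_euc(raw: bytes) -> bytes:
    """Re-encode Shift-JIS bytes (e.g. from a web form) as EUC-JP for storage.

    Raises UnicodeDecodeError if raw is not valid Shift-JIS.
    """
    return raw.decode(SJIS).encode(EUC)


def euc_to_sjis(raw: bytes) -> bytes:
    """Re-encode EUC-JP bytes (e.g. from a query result) as Shift-JIS for the client.

    Raises UnicodeDecodeError if raw is not valid EUC-JP.
    """
    return raw.decode(EUC).encode(SJIS)


if __name__ == "__main__":
    # "\u65e5\u672c\u8a9e" is the word "Nihongo" (Japanese) in kanji.
    text = "\u65e5\u672c\u8a9e"
    from_browser = text.encode(SJIS)      # what a Shift-JIS page would submit
    for_db = sjis_to_euc(from_browser)    # what an EUC_JP database expects
    back_to_client = euc_to_sjis(for_db)  # what goes back to the browser
    print(for_db == text.encode(EUC), back_to_client == from_browser)
```

The point of the sketch is that the conversion lives in exactly two places, one on the way in and one on the way out, so the rest of the code never has to know which encoding the web pages use.
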
I did a totally unofficial, very-small-data-sample survey of Japanese web sites, and it seems that in general, sites that deal with ordinary consumers (and likely are written on Microsoft machines) use Shift-JIS (even ones that I figure must use databases, like search engines and e-commerce), Linux-related sites use JIS, and PostgreSQL-related sites use EUC. I'm sure there's a grand story to explain how it got to be this messy, but for right now, I guess we have to live with all these different systems - apparently there is not one system that works nicely for all things, or else the others would gradually become obsolete, right?

Before I add jcode function calls for every piece of data I get in or out of the database, or convert all my web page text to EUC-JP (I haven't decided yet which approach is more work, or more of a problem to maintain), are there any other thoughts on this? For example, does someone know of one of the following:

(a) a way to get the text-only console of a RedHat 6.1J box to actually display Japanese characters (if so, I not only wouldn't have to deal with the Windows box for editing, I could even read the output of queries in psql!), or

(b) a text editor for Windows that can be configured to default to EUC, rather than having to remember to always select a filter to convert to and from Shift-JIS?

Or on the flip side of the discussion, can anyone imagine pitfalls associated with having a web site that is half EUC (the PHP and Perl files that deal with the database) and half Shift-JIS (the static HTML pages that are written by other people in who-knows-what Windows-based tools)?

Thanks,

--------------------------------
Karen Ellrick
S & C Technology, Inc.
1-21-35 Kusatsu-shinmachi
Hiroshima 733-0834 Japan
(from U.S. 011-81, from Japan 0) 82-293-2838
--------------------------------