Re: [SQL] Understanding Encoding
> Hello All,
>
> I am not able to understand how the encoding is handled. I would be happy
> if someone can tell what is happening in the following scenario:
>
> 1. I have created a database with EUC_KR encoding and created a table and
> inserted some korean value into it.
>
> =# CREATE DATABASE korean WITH ENCODING 'EUC_KR' LC_COLLATE='ko_KR.euckr'
> LC_CTYPE='ko_KR.euckr' TEMPLATE=template0;
>
> =# \c korean
>
> korean=# SHOW client_encoding;
> client_encoding
> -----------------
> UTF8
> (1 row)
>
> korean=# CREATE TABLE tbl (doc text);
>
> korean=# INSERT INTO tbl VALUES ('그레스');
>
>
> 2. If I insert non-korean values it throws error:
>
> korean=# INSERT INTO tbl VALUES ('データベース');
> ERROR: character with byte sequence 0xe3 0x83 0xbc in encoding "UTF8" has
> no equivalent in encoding "EUC_KR"
The error message says it all. PostgreSQL accepted 'データベース'
encoded in UTF-8, then tried to convert it to EUC_KR and failed, because
EUC_KR has no code for the character 'ー' (U+30FC, the bytes 0xe3 0x83
0xbc named in the error). What else did you expect?
> korean=# SELECT * FROM tbl;
> doc
>
> 그레스
> (1 row)
>
>
> 3. I change the client encoding to EUC_KR and try inserting the same korean
> characters and it throws an error:
>
> korean=# SET client_encoding = 'EUC_KR';
> SET
> korean=# INSERT INTO tbl VALUES ('그레스');
> ERROR: invalid byte sequence for encoding "EUC_KR": 0xa0 0x88
0xa0 is definitely not a valid byte in EUC_KR. That's why PostgreSQL throws an
error. I guess you are using UHC (Unified Hangul Code) rather than
EUC_KR. They are different encodings. You should do one of the following:
1) Make sure that your terminal encoding is EUC_KR.
2) set client_encoding = 'uhc';
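
One way to see where 0xa0 0x88 could come from is to look at the bytes a
terminal would send for '그레스' in a given encoding and pair them up the way
an EUC_KR validator would. The sketch below is only an illustration: the
helper name first_bad_pair is made up for this example, it assumes EUC_KR
lead bytes lie in 0xA1..0xFE, and it checks lead bytes only, which is
simpler than what the server really does.

```python
# Illustration only: walk the UTF-8 bytes of the string as if they were EUC_KR.
# Assumption: a non-ASCII EUC_KR character is two bytes with a lead byte in 0xA1..0xFE.
text = "그레스"
utf8_bytes = text.encode("utf-8")


def first_bad_pair(data):
    """Return the first two-byte pair whose lead byte is not a valid
    EUC_KR lead byte, or None if every lead byte is acceptable."""
    i = 0
    while i < len(data):
        if data[i] < 0x80:          # ASCII occupies a single byte
            i += 1
            continue
        pair = data[i:i + 2]
        if not 0xA1 <= pair[0] <= 0xFE:
            return pair
        i += 2
    return None


bad = first_bad_pair(utf8_bytes)
print(utf8_bytes.hex(" "))   # ea b7 b8 eb a0 88 ec 8a a4
print(bad.hex(" "))          # a0 88
```

If the pair printed matches the one in the error message, the text reaching
the server is probably UTF-8 rather than EUC_KR, and the terminal encoding
is the thing to check.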
> Even the SELECT statement displays something different. I am not able to
> understand why?
>
> korean=# SELECT * FROM tbl;
> doc
>
> ����
> (1 row)
This is for the same reason as above.
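
A rough sketch of that effect, assuming the server sends the stored EUC_KR
bytes unconverted (because client_encoding is EUC_KR) and the terminal
interprets them as UTF-8. The variable names are only for illustration.

```python
# Illustration only: EUC_KR bytes read by something that expects UTF-8.
stored = "그레스".encode("euc_kr")                  # bytes as stored in the EUC_KR database
shown = stored.decode("utf-8", errors="replace")   # what a UTF-8 terminal makes of them

print(len(stored))   # 6 (two bytes per Hangul syllable)
print(shown)         # mostly U+FFFD replacement characters, not the original text
```

The exact glyphs depend on the terminal, but the original syllables cannot
come back, because the bytes are not valid UTF-8.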
> Can someone please help me.
>
> Thank you,
>
> Beena Emerson
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] [NOVICE] Understanding Encoding
On Fri, Sep 6, 2013 at 3:47 PM, Beena Emerson wrote:
>
>>
>> I wonder if you have tried changing your "locale" to ko_KR; something
>> like:
>>
>> LANG=ko_KR LC_ALL=ko_KR \
>> psql -d korean
>>
>
> Hi,
>
> It still gives same result:
>
> $ LANG=ko_KR LC_ALL=ko_KR
> $ psql -d korean
>
> korean=# SHOW client_encoding;
> client_encoding
> -----------------
> EUC_KR
> (1 row)
>
> korean=# INSERT INTO tbl VALUES ('그레스');
> ERROR: invalid byte sequence for encoding "EUC_KR": 0xa0 0x88
I changed the encoding of the terminal emulator (GNOME Terminal
2.31.3) using the Terminal menu as:
Terminal -> Set Character Encoding -> Korean (EUC-KR)
Note that, if the menu only lists UTF-8, you'd have to add EUC-KR
using "Add or Remove".
And it seems to work; could you try the same?
--
Amit Langote
Re: [SQL] [NOVICE] Understanding Encoding
On Fri, Sep 6, 2013 at 12:29 PM, Tom Lane wrote:
> Beena Emerson writes:
> > It still gives same result:
>
> > $ LANG=ko_KR LC_ALL=ko_KR
> > $ psql -d korean
>
> > korean=# SHOW client_encoding;
> > client_encoding
> > -----------------
> > EUC_KR
> > (1 row)
>
> > korean=# INSERT INTO tbl VALUES ('그레스');
> > ERROR: invalid byte sequence for encoding "EUC_KR": 0xa0 0x88
>
> What you need to figure out is what encoding the text you are typing
> is in. You're telling psql it's EUC_KR but it evidently isn't.
> If you're typing these characters manually then it's probably determined
> by a setting of the terminal-emulator program you're using. But if
> you're copying-and-pasting then things get more complicated.
>
> Also, what you did above is not what Amit suggested: he wanted you to put
> the variable assignments on the same command line as the psql invocation,
> so that they'd affect the environment passed to psql. I'm suspicious of
> his solution because I'd have thought the terminal program would set up
> the right environment ... but you might as well try it.
>
I tried with both the assignment and the invocation on the same line. Again it
gave the same result.
Maybe the problem is with copy and paste. I will look into it.
Thank you.
Re: [SQL] Understanding Encoding
Hi,
Tip:
To identify which encoding you are entering in the psql command interpreter:
1) Open a file with vim
2) Type in your SQL or copy/paste it
3) Save the file and quit vim
4) Run the file command on the saved file
This should give you the encoding of that text file.
For ex:
sf@orca:~$ echo $LC_ALL
en_US.UTF-8
sf@orca:~$ cat /tmp/xx
abcdefé
sf@orca:~$ file /tmp/xx
/tmp/xx: UTF-8 Unicode text
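
If the file command is not available, a similar check can be sketched in
Python by trying a few candidate encodings and keeping the ones that decode
without error. The function name decodable_as and the candidate list are
assumptions made for this example, and a successful decode only means the
bytes are valid in that encoding, not that it is the one intended.

```python
# Illustration only: which candidate encodings accept these bytes?
def decodable_as(data, candidates=("ascii", "utf-8", "euc_kr")):
    """Return the candidate encodings in which data decodes without error."""
    ok = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        ok.append(name)
    return ok


sample = "그레스".encode("utf-8")   # stand-in for bytes read from the saved file
print(decodable_as(sample))        # ['utf-8']
```

For a real file, the bytes would come from open(path, "rb").read() instead
of the hard-coded sample.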
Seb
On 09/06/2013 09:03 AM, Tatsuo Ishii wrote:
[...]
Re: [SQL] Understanding Encoding
Hello,
Thank you all.
Amit, changing the encoding of the terminal emulator worked.
Sebastiean, the tip was helpful.
--
Beena Emerson
