[BUGS] BUG #1721: mutiple bytes character string comaprison error
The following bug has been logged online:

Bug reference:      1721
Logged by:          Chii-Tung Liu
Email address:      [EMAIL PROTECTED]
PostgreSQL version: 8.0.3
Operating system:   Windows XP SP2
Description:        mutiple bytes character string comaprison error
Details:

When comparing two UTF-8 encoded strings that contain Chinese words, the result is always TRUE.

1. Create a database test with encoding set to unicode:

    CREATE DATABASE test
      WITH OWNER = postgres
           ENCODING = 'UNICODE'
           TABLESPACE = pg_default;

2. Insert data with Chinese words:

    INSERT INTO node (title) VALUES ('1 中文');

3. SELECT title FROM node WHERE title > '1.1 ' returns '1 中文'.

4. Both SELECT '1 中文' > '1.1' and SELECT '1.1' > '1 中文' return FALSE.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
[BUGS] B-tree unique index duplicate key error happens only in SUSE 9.3
This bug happens in SUSE 9.3 on both Pentium 4 and AMD64, whether the binaries are from postgresql-8.0.1 RPMs on the SUSE 9.3 DVD or are built from 8.0.3 source code. However this bug does NOT happen with a Debian box (unstable) running 8.0.3 on an x86 (Athlon XP, whether binary or built from source).

The problem is that PostgreSQL claims two records have the same value for one string attribute that has a unique constraint, while in fact the two string values are different.

To see this bug, just do a restore from the pg_dump'ed file attached to this email. Sample steps and error message follow:

--- begin command ---
createdb -E utf8 pg_bug
psql pg_bug < pg_dup_key_bug.dump
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "gaocanusers_userid_key" for table "gaocanusers"
CREATE TABLE
ERROR:  duplicate key violates unique constraint "gaocanusers_userid_key"
CONTEXT:  COPY gaocanusers, line 2: "129406 ���ズ� [EMAIL PROTECTED] f U\N \N \N \N -- f 2002-09-12 00:00:00 \N \\3031\\3..."
--- end ---

Attachment: pg_dup_key_bug.dump (binary data)
Re: [BUGS] B-tree unique index duplicate key error happens only in SUSE 9.3
FYI, works just fine on Gentoo with the UTF8 and ICU patches.

... John

> This bug happens in SUSE 9.3 on both Pentium 4 and AMD64,
> whether the binaries are from postgresql-8.0.1 RPMs on the
> SUSE 9.3 DVD or are built from 8.0.3 source code. However
> this bug does NOT happen with a Debian box (unstable) running
> 8.0.3 on an x86 (Athlon XP, whether binary or built from
> source). The problem is that PostgreSQL claims two records have the
> same value for one string attribute that has a unique
> constraint, while in fact the two string values are
> different. To see this bug, just do a restore from the
> pg_dump'ed file attached to this email. Sample steps and
> error message follow:
Re: [BUGS] B-tree unique index duplicate key error happens only in SUSE 9.3
Zhenlei Cai <[EMAIL PROTECTED]> writes:
> This bug happens in SUSE 9.3 on both Pentium 4 and AMD64, whether the
> binaries are from postgresql-8.0.1 RPMs on the SUSE 9.3 DVD or are
> built from 8.0.3 source code. However this bug does NOT happen with a
> Debian box (unstable) running 8.0.3 on an x86 (Athlon XP, whether
> binary or built from source). The problem is that PostgreSQL claims two

What makes you think this is a Postgres bug, rather than a bug in the locale definition you are using on the SUSE box? Try feeding the two strings in question to strcoll() and see what happens.

One way that you can get inconsistent results from strcoll() is if you feed it strings that are invalid according to the character set encoding that strcoll() thinks you are using, which is to say the encoding implied by the current LC_CTYPE locale setting. So it's possible that the real problem is that you have Postgres' database encoding set to something that's incompatible with the postmaster's LC_CTYPE locale. (Try "show lc_ctype" to see what that is exactly.)

    regards, tom lane
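A minimal sketch of the strcoll() check suggested above, written in Python for brevity. It assumes Python's locale.strcoll is an acceptable stand-in for the C library call: it goes through the C library's collation but works on decoded text rather than on the raw bytes the backend sees, so treat it as an approximation only. The helper names sign, strcoll_is_antisymmetric and is_valid_in are made up for this illustration and are not part of PostgreSQL.

```python
import locale


def sign(n: int) -> int:
    """Reduce a comparison result to -1, 0, or 1."""
    return (n > 0) - (n < 0)


def strcoll_is_antisymmetric(a: str, b: str) -> bool:
    """True if strcoll(a, b) and strcoll(b, a) agree with each other.

    A sane collation must satisfy sign(cmp(a, b)) == -sign(cmp(b, a)).
    """
    return sign(locale.strcoll(a, b)) == -sign(locale.strcoll(b, a))


def is_valid_in(data: bytes, encoding: str) -> bool:
    """True if the raw bytes decode cleanly in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    try:
        # "" picks the locale up from the environment; substitute an
        # explicit locale name to test one specific locale definition.
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # Fall back to the C locale if the environment's locale is missing.
        locale.setlocale(locale.LC_ALL, "C")

    a, b = "1 中文", "1.1"
    print("LC_COLLATE:", locale.setlocale(locale.LC_COLLATE))
    print("antisymmetric:", strcoll_is_antisymmetric(a, b))
    print("valid as UTF-8:", is_valid_in(a.encode("utf-8"), "utf-8"))
    print("valid as ASCII:", is_valid_in(a.encode("utf-8"), "ascii"))
```

If the antisymmetry check fails under the suspect locale but passes under C, that points toward the locale definition. If the stored bytes are not valid in the encoding implied by lc_ctype, that points toward an encoding/locale mismatch.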
Re: [BUGS] BUG #1721: mutiple bytes character string comaprison error
"Chii-Tung Liu" <[EMAIL PROTECTED]> writes:
> PostgreSQL version: 8.0.3
> Operating system:   Windows XP SP2

> When comparing two UTF-8 encoded strings that contain Chinese words, the
> result is always TRUE

Sorry, but UTF-8 encoding doesn't work properly on Windows (yet).
Use some other database encoding.

    regards, tom lane
Re: [BUGS] BUG #1721: mutiple bytes character string comaprison
On Sun, 19 Jun 2005, Tom Lane wrote:

> "Chii-Tung Liu" <[EMAIL PROTECTED]> writes:
> > PostgreSQL version: 8.0.3
> > Operating system:   Windows XP SP2
> >
> > When comparing two UTF-8 encoded strings that contain Chinese words, the
> > result is always TRUE
>
> Sorry, but UTF-8 encoding doesn't work properly on Windows (yet).
> Use some other database encoding.

Shouldn't we forbid its creation then? At least a strongly worded warning? We see these complaints too often.

Kris Jurka
Re: [BUGS] BUG #1721: mutiple bytes character string comaprison error
Kris Jurka <[EMAIL PROTECTED]> writes:
> On Sun, 19 Jun 2005, Tom Lane wrote:
>> Sorry, but UTF-8 encoding doesn't work properly on Windows (yet).
>> Use some other database encoding.

> Shouldn't we forbid its creation then?

There was serious discussion of that before the 8.0 release, but we decided not to forbid it. Check the archives; I don't recall the reasoning at the moment.

> We see these complaints too often.

There are lots of complaints we see way too often ;-) ... but distressingly, there are still only 24 hours in a day.

    regards, tom lane
Re: [BUGS] BUG #1721: mutiple bytes character string comaprison
> The following bug has been logged online:
>
> Bug reference:      1721
> Logged by:          Chii-Tung Liu
> Email address:      [EMAIL PROTECTED]
> PostgreSQL version: 8.0.3
> Operating system:   Windows XP SP2
> Description:        mutiple bytes character string comaprison error
> Details:
>
> When comparing two UTF-8 encoded strings that contain Chinese words, the
> result is always TRUE.
>
> 1. Create a database test with encoding set to unicode:
>
>     CREATE DATABASE test
>       WITH OWNER = postgres
>            ENCODING = 'UNICODE'
>            TABLESPACE = pg_default;
>
> 2. Insert data with Chinese words:
>
>     INSERT INTO node (title) VALUES ('1 中文');
>
> 3. SELECT title FROM node WHERE title > '1.1 ' returns '1 中文'.
>
> 4. Both SELECT '1 中文' > '1.1' and SELECT '1.1' > '1 中文' return
> FALSE.

I think you need to use the C locale.
--
Tatsuo Ishii
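For reference, here is a rough sketch of what a C-locale comparison amounts to for the strings in the report. It assumes that in the C locale the comparison reduces to a strcmp()-style, byte-by-byte ordering of the encoded strings, and it models that in Python by comparing the UTF-8 bytes directly. The function name bytewise_greater is hypothetical and only serves this illustration; it is not how the backend is implemented.

```python
def bytewise_greater(a: str, b: str, encoding: str = "utf-8") -> bool:
    """Compare the way strcmp() would: byte by byte on the encoded strings."""
    return a.encode(encoding) > b.encode(encoding)


if __name__ == "__main__":
    title, probe = "1 中文", "1.1"
    # The second byte decides: space (0x20) sorts before '.' (0x2E).
    print(bytewise_greater(title, probe))   # False
    print(bytewise_greater(probe, title))   # True
```

Under this ordering exactly one of the two comparisons from step 4 is true, and the row would not match title > '1.1 ' in step 3.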