At 09:20 PM 8/24/2004 +0200, Peter Eisentraut wrote:
David Wheeler wrote:
> That's not the trouble so much as that the locales can be badly
If we always followed the principle "X could be broken, so let's not use
X", then we would never get anything done. Instead, "X is broken, so
fix it".
> broke
David Wheeler <[EMAIL PROTECTED]> writes:
>>> Hmm. I tried putting your string into a UNICODE database and I got
>>> ERROR: invalid byte sequence for encoding "UNICODE": 0xc7
>>
>> Really? Curious.
> Oh, are you sure that you got my UTF-8 data? Because it came back in
> your reply all mangled.
On Aug 24, 2004, at 12:20 PM, Peter Eisentraut wrote:
broken, and that they're useless for multilingual use.
I don't agree with that, but perhaps we differ in our interpretation of
"multilingual use". If you have special requirements, you can always
turn the locales off.
Well, we're getting beyond
On Aug 23, 2004, at 10:25 PM, Joel wrote:
If the locale machinery is functioning correctly (and if I understand
correctly), there ought to be a setting that would allow those to
collate to the same point.
Bleh. There must be some distinction between them. It sounds like
querying for synonyms.
I'm
David Wheeler wrote:
> But given what you've said, Tatsuo, it makes me wonder if it's worth
> it to use the system locale default when running initdb?
Yes, because that is the locale that the user prefers. If a locale is
broken then you shouldn't set it as system locale in the first place.
--
On Tue, 24 Aug 2004 01:34:46 +0200
Ian Barwick <[EMAIL PROTECTED]> wrote
> ...
> wild speculation in need of a Korean speaker, but:
>
> [EMAIL PROTECTED]:~/tmp> cat j.txt
> $Bec,e$;ec(B
> $ByyPl%$%9wd!"(B
> $Bx"(l%$(Bl$B%i(B
> $Bw{%1v.%/wd(B
>
On Aug 23, 2004, at 6:49 PM, Tim Allen wrote:
One possible clue: your original post in this thread was using
encoding euc-kr, not unicode (utf-8). If your mailer was set to use
that encoding, perhaps your other client software is/was also?
Bah! Stupid Mail.app was trying to be too smart!
Thanks,
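To illustrate Tim's clue, here is a minimal Python sketch (not anything posted in the thread) that takes the quoted-printable bytes from the query quoted elsewhere in this thread, '=BA=CF=C7=D1=C0=C7', and tries them as EUC-KR and as UTF-8. The helper name try_decode is hypothetical, and Python's codecs stand in for whatever the mail client and the server actually did.

```python
import quopri

# Bytes exactly as they appear (quoted-printable) in the query quoted in this thread.
qp_literal = b"=BA=CF=C7=D1=C0=C7"
raw = quopri.decodestring(qp_literal)


def try_decode(data: bytes, encoding: str) -> str:
    """Hypothetical helper: decode strictly, or describe why decoding failed."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"<not valid {encoding}: {exc.reason}>"


as_euc_kr = try_decode(raw, "euc-kr")
as_utf8 = try_decode(raw, "utf-8")

print("raw bytes:", raw.hex(" "))
print("as euc-kr:", as_euc_kr)
print("as utf-8: ", as_utf8)
```

If the bytes decode cleanly as EUC-KR but not as UTF-8, that is consistent with the client having sent EUC-KR data to a UNICODE database.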
Tom Lane wrote:
David Wheeler <[EMAIL PROTECTED]> writes:
bric=# reindex index udx_keyword__name;
REINDEX
bric=# select * from keyword where name ='=BA=CF=C7=D1=C0=C7';
id | name | screen_name | sort_name | active
--++-+---+
1218 | =B1=B9=B9=E6=BA
On Aug 23, 2004, at 5:22 PM, Tatsuo Ishii wrote:
Locales for multibyte encodings are often broken on many platforms. I
see identical things with Japanese on Red Hat. This is one of the
reasons why I tell Japanese PostgreSQL users not to enable locale during
initdb...
Yep, and exporting my data, delet
> >
> > On Mon, 23.08.2004, at 23:04, David Wheeler wrote:
> > > On Aug 23, 2004, at 1:58 PM, Ian Barwick wrote:
> > >
> > > > er, the characters in "name" don't seem to match the characters in the
> > > > query - '국방비' vs. '북한의' - does that have any bearing?
> > >
> > > Yes, it means that = is doin
On Aug 23, 2004, at 5:07 PM, Ian Barwick wrote:
Does this go away if you change your locale to C?
Yes.
Hallelujah! I'm running initdb again now.
Cheers,
David
On Mon, 23 Aug 2004 16:50:04 -0700, David Wheeler <[EMAIL PROTECTED]> wrote:
> On Aug 23, 2004, at 4:34 PM, Ian Barwick wrote:
>
> > wild speculation in need of a Korean speaker, but:
> >
> > [EMAIL PROTECTED]:~/tmp> cat j.txt
> > テスト
> > 환경설
> > 전검색
> > 웹문서
> > 국방비
> > 북한의
> > てすと
> > [EMAIL PROT
On Aug 23, 2004, at 4:49 PM, David Wheeler wrote:
Hmm. I tried putting your string into a UNICODE database and I got
ERROR: invalid byte sequence for encoding "UNICODE": 0xc7
Really? Curious.
Oh, are you sure that you got my UTF-8 data? Because it came back in
your reply all mangled.
Cheers,
Da
On Aug 23, 2004, at 4:34 PM, Ian Barwick wrote:
wild speculation in need of a Korean speaker, but:
[EMAIL PROTECTED]:~/tmp> cat j.txt
テスト
환경설
전검색
웹문서
국방비
북한의
てすと
[EMAIL PROTECTED]:~/tmp> uniq j.txt
テスト
환경설
てすと
All but the first and last lines are random Korean (Hangul)
characters. Evidently our re
On Aug 23, 2004, at 4:35 PM, Tom Lane wrote:
Hmm. I tried putting your string into a UNICODE database and I got
ERROR: invalid byte sequence for encoding "UNICODE": 0xc7
Really? Curious.
So there's something funny happening here. What is your
client_encoding
setting?
It's not set. I've had it c
David Wheeler <[EMAIL PROTECTED]> writes:
>> Is the problem query using an index? If so, does REINDEX help?
> Doesn't look like it:
> bric=# reindex index udx_keyword__name;
> REINDEX
> bric=# select * from keyword where name ='=BA=CF=C7=D1=C0=C7';
>id | name | screen_name | sort_na
On Tue, 24 Aug 2004 00:46:50 +0200, Markus Bertheau
<[EMAIL PROTECTED]> wrote:
>
>
> On Mon, 23.08.2004, at 23:04, David Wheeler wrote:
> > On Aug 23, 2004, at 1:58 PM, Ian Barwick wrote:
> >
> > > er, the characters in "name" don't seem to match the characters in the
> > > query - '국방비' vs. '북한의'
On Aug 23, 2004, at 4:08 PM, Tom Lane wrote:
[ looks back at discussion... ] Actually I misremembered --- the
discussion was about how we would *reject* legal UTF-8 codes that are
more than 2 bytes long. So the code is broken, but not in the
direction
that would cause your problem. Time for ano
David Wheeler <[EMAIL PROTECTED]> writes:
> Is the encoding check fixed in 8.0beta1?
[ looks back at discussion... ] Actually I misremembered --- the
discussion was about how we would *reject* legal UTF-8 codes that are
more than 2 bytes long. So the code is broken, but not in the direction
that
On Aug 23, 2004, at 3:59 PM, Tom Lane wrote:
But is it possible to store non-UTF-8 data in a UNICODE database?
In theory not ... but I think there was a discussion earlier that
concluded that our check for encoding validity is not airtight ...
Well, if it was mostly right, I wouldn't expect it to b
David Wheeler <[EMAIL PROTECTED]> writes:
> But is it possible to store non-UTF-8 data in a UNICODE database?
In theory not ... but I think there was a discussion earlier that
concluded that our check for encoding validity is not airtight ...
regards, tom lane
---
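For readers following the point about UTF-8 sequences longer than 2 bytes: the Hangul strings in this thread are made of 3-byte sequences once they are stored as UTF-8. The sketch below is only an illustration; it uses Python's strict decoder as a stand-in validity check (is_valid_utf8 is a hypothetical name), not the server's own code.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Stand-in check: True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


# The two strings compared in this thread.
samples = ["국방비", "북한의"]

for text in samples:
    encoded = text.encode("utf-8")
    # Number of UTF-8 bytes used by each character of the string.
    bytes_per_char = [len(ch.encode("utf-8")) for ch in text]
    print(text, bytes_per_char, is_valid_utf8(encoded))

# The lone byte from the error message quoted in the thread is not valid UTF-8 on its own.
print(is_valid_utf8(b"\xc7"))
```

A check that rejects sequences longer than 2 bytes would therefore refuse this data even when it is legal UTF-8.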
On Aug 23, 2004, at 3:46 PM, Markus Bertheau wrote:
The collation rules of your (and my) locale say that these strings are
the same:
[EMAIL PROTECTED] markus]$ cat > t
국방비
북한의
[EMAIL PROTECTED] markus]$ uniq t
국방비
[EMAIL PROTECTED] markus]$
Interesting.
Make sure that you have initdb'd the database
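As a rough cross-check of the uniq result above, the same two strings can be compared through the C library's collation routine from Python. This is a sketch under the assumption that the "C" locale is available (it always should be); it does not reproduce the broken locale, it only shows what a byte/code-point ordering gives for comparison.

```python
import locale

# The two strings that uniq treated as duplicates in this thread.
a, b = "국방비", "북한의"

# Switch collation to the plain "C" locale, which is always available.
locale.setlocale(locale.LC_COLLATE, "C")

# strcoll returns 0 only if the locale considers the strings equal.
different = locale.strcoll(a, b) != 0
same_as_self = locale.strcoll(a, a) == 0

print("distinct under C collation:", different)
print("equal to itself:", same_as_self)
```

Whether a given system locale reports 0 for this pair depends on the platform's locale data, which is the situation the thread is describing.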