Tom Lane wrote:
> "hernan gonzalez" <[EMAIL PROTECTED]> writes:
> > test=# create view vchartest as
> > select encode(convert_to(c,'LATIN9'),'escape') as c1 from chartest;
>
> Hmm. This isn't a very sensible combination that you've written here,
> but I see the point: encode(..., 'escape') is br
"hernan gonzalez" <[EMAIL PROTECTED]> writes:
> test=# create view vchartest as
> select encode(convert_to(c,'LATIN9'),'escape') as c1 from chartest;
Hmm. This isn't a very sensible combination that you've written here,
but I see the point: encode(..., 'escape') is broken in that it fails
to con
"hernan gonzalez" <[EMAIL PROTECTED]> writes:
> The objectionable ones IMHO are decode()/encode(), which can
> consume/produce a "non-utf8 string" (I mean, not the backend encoding)
Huh? Those deal with bytea too --- in fact, they've got nothing at
all to do with multibyte character representation
Another example (PostgreSQL 8.3.0, UTF-8 server/client encoding)
test=# create table chartest ( c text);
test=# insert into chartest (c) values ('¡Hasta mañana!');
test=# create view vchartest as
select encode(convert_to(c,'LATIN9'),'escape') as c1 from chartest;
test=# select c,octet_length(c
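
A minimal sketch of the byte-level distinction this example turns on, written in Python rather than SQL so it can be run without a server. It is an illustration under assumptions, not PostgreSQL's implementation: the `iso8859_15` codec stands in for LATIN9, and the helper `is_valid_utf8` is a hypothetical name introduced here.

```python
# Illustration only: Python stands in for the server; this is not PostgreSQL code.
text = "¡Hasta mañana!"

# Bytes as a UTF-8 backend would hold the text value.
utf8_bytes = text.encode("utf-8")

# Bytes in ISO-8859-15 (LATIN9), roughly what convert_to(c, 'LATIN9') produces.
latin9_bytes = text.encode("iso8859_15")

char_count = len(text)          # number of characters
utf8_len = len(utf8_bytes)      # '¡' and 'ñ' take two bytes each in UTF-8
latin9_len = len(latin9_bytes)  # one byte per character in LATIN9


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: report whether the bytes decode as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(char_count, utf8_len, latin9_len)
print(is_valid_utf8(utf8_bytes), is_valid_utf8(latin9_bytes))
```

The LATIN9 bytes are not valid UTF-8, so any function that relabels them as text in a UTF-8 database without re-encoding them yields a value that is not in the backend encoding.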
> IMHO, the semantics of encode() and decode() are correct (the bridge
> between bytea and text ... in the backend encoding; they should be the
> only bridge), convert() is also ok (deals with bytes), but
> convert_to() and convert_from() are dubious if not broken: they imply
> texts in arbitrary e
"hernan gonzalez" <[EMAIL PROTECTED]> writes:
> IMHO, the semantics of encode() and decode() are correct (the bridge
> between bytea and text ... in the backend encoding; they should be the
> only bridge), convert() is also ok (deals with bytes), but
> convert_to() and convert_from() are dubious i
> Umm, I think all you showed was that the to_ascii() function was
> broken. Postgres knows exactly what encoding the string is in, the
> backend encoding: in your case UTF-8.
That would be fine, if it were true; then, one could assume that every
postgresql function that returns a text gets ALW
On Fri, Feb 22, 2008 at 01:54:46PM -0200, hernan gonzalez wrote:
> > It seems to me that postgres is trying to do as you suggest: text is
> > characters and bytea is bytes, like in Java.
>
> But the big difference is that, for text type, postgresql knows "this
> is a text" but doesn't know the en
> It seems to me that postgres is trying to do as you suggest: text is
> characters and bytea is bytes, like in Java.
But the big difference is that, for text type, postgresql knows "this
is a text"
but doesn't know the encoding, as my example showed. This goes against
the concept of "text vs byt
Martijn van Oosterhout wrote:
> The most surprising thing is that to_ascii won't accept a bytea. TBH the
> whole to_ascii function seems somewhat half-baked. If what you're
> trying to do is remove accents, there are perl functions around that do
> that. Basically, the switch to a different norm
On Thu, Feb 21, 2008 at 02:34:15PM -0200, hernan gonzalez wrote:
> (After dealing a while with this, and learning a little, I thought of
> posting this as a comment in the docs, but perhaps someone who knows better
> can correct or clarify)
It seems to me that postgres is trying to do as you suggest: te
(After dealing a while with this, and learning a little, I thought of
posting this as a comment in the docs, but perhaps someone who knows better
can correct or clarify)
=
The issues of charset encodings