Re: [GENERAL] text and bytea

2008-03-03 Thread Bruce Momjian
Tom Lane wrote: > "hernan gonzalez" <[EMAIL PROTECTED]> writes: > > test=# create view vchartest as > > select encode(convert_to(c,'LATIN9'),'escape') as c1 from chartest; > > Hmm. This isn't a very sensible combination that you've written here, > but I see the point: encode(..., 'escape') is br

Re: [GENERAL] text and bytea

2008-02-25 Thread Tom Lane
"hernan gonzalez" <[EMAIL PROTECTED]> writes: > test=# create view vchartest as > select encode(convert_to(c,'LATIN9'),'escape') as c1 from chartest; Hmm. This isn't a very sensible combination that you've written here, but I see the point: encode(..., 'escape') is broken in that it fails to con

Re: [GENERAL] text and bytea

2008-02-25 Thread Tom Lane
"hernan gonzalez" <[EMAIL PROTECTED]> writes: > The objetionable ones IMHO are decode()/encode(), which can > consume/produce a "non-utf8 string" (I mean, not the backend encoding) Huh? Those deal with bytea too --- in fact, they've got nothing at all to do with multibyte character representation

Re: [GENERAL] text and bytea

2008-02-25 Thread hernan gonzalez
Another example (PostgreSQL 8.3.0, UTF-8 server/client encoding)
test=# create table chartest ( c text);
test=# insert into chartest (c) values ('¡Hasta mañana!');
test=# create view vchartest as select encode(convert_to(c,'LATIN9'),'escape') as c1 from chartest;
test=# select c,octet_length(c
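The byte-level effect behind this example can be reproduced outside the database. The following is a minimal sketch in Python, not PostgreSQL code: it assumes that Python's "iso8859-15" codec matches the LATIN9 encoding named above, and the helper is_valid_utf8 is a hypothetical name introduced only for illustration. It shows that the LATIN9 bytes of the sample string are not valid UTF-8, so treating them as text in a UTF-8 backend would yield an invalidly encoded string.

```python
# Sample string from the example above.
text = "¡Hasta mañana!"

# Assumption: Python's "iso8859-15" codec corresponds to PostgreSQL's LATIN9.
latin9_bytes = text.encode("iso8859-15")
utf8_bytes = text.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Character count versus byte counts in each encoding.
print(len(text), len(latin9_bytes), len(utf8_bytes))

# The LATIN9 byte sequence is not a valid UTF-8 string.
print(is_valid_utf8(latin9_bytes), is_valid_utf8(utf8_bytes))
```

The two accented characters take one byte each in LATIN9 and two bytes each in UTF-8, which is why the byte lengths differ while the character count does not.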

Re: [GENERAL] text and bytea

2008-02-25 Thread hernan gonzalez
> IMHO, the semantics of encode() and decode() are correct (the bridge > between bytea and text ... in the backend encoding; they should be the > only bridge), convert() is also ok (deals with bytes), but > convert_to() and convert_from() are dubious if not broken: they imply > texts in arbitrary e

Re: [GENERAL] text and bytea

2008-02-25 Thread Gregory Stark
"hernan gonzalez" <[EMAIL PROTECTED]> writes: > IMHO, the semantics of encode() and decode() are correct (the bridge > between bytea and text ... in the backend encoding; they should be the > only bridge), convert() is also ok (deals with bytes), but > convert_to() and convert_from() are dubious i

Re: [GENERAL] text and bytea

2008-02-25 Thread hernan gonzalez
> Umm, I think all you showed was that the to_ascii() function was > broken. Postgres knows exactly what encoding the string is in, the > backend encoding: in your case UTF-8. That would be fine, if it were true; then, one could assume that every postgresql function that returns a text gets ALW

Re: [GENERAL] text and bytea

2008-02-24 Thread Martijn van Oosterhout
On Fri, Feb 22, 2008 at 01:54:46PM -0200, hernan gonzalez wrote: > > It seems to me that postgres is trying to do as you suggest: text is > > characters and bytea is bytes, like in Java. > > But the big difference is that, for text type, postgresql knows "this > is a text" but doesn't know the en

Re: [GENERAL] text and bytea

2008-02-24 Thread hernan gonzalez
> It seems to me that postgres is trying to do as you suggest: text is > characters and bytea is bytes, like in Java. But the big difference is that, for text type, postgresql knows "this is a text" but doesn't know the encoding, as my example showed. This goes against the concept of "text vs byt

Re: [GENERAL] text and bytea

2008-02-22 Thread Alvaro Herrera
Martijn van Oosterhout wrote: > The most surprising thing is that to_ascii won't accept a bytea. TBH the > whole to_ascii function seems somewhat half-baked. If what you're > trying to do is remove accents, there are Perl functions around that do > that. Basically, the switch to a different norm

Re: [GENERAL] text and bytea

2008-02-22 Thread Martijn van Oosterhout
On Thu, Feb 21, 2008 at 02:34:15PM -0200, hernan gonzalez wrote: > (After dealing a while with this, and learning a little, I thought of > posting this as a comment in the docs, but perhaps someone who knows better > can correct or clarify) It seems to me that postgres is trying to do as you suggest: te

[GENERAL] text and bytea

2008-02-21 Thread hernan gonzalez
(After dealing a while with this, and learning a little, I thought of posting this as a comment in the docs, but perhaps someone who knows better can correct or clarify) = The issues of charset encodings