On Tue, 2006-10-31 at 23:18 -0500, Tom Lane wrote: > Jeff Davis <[EMAIL PROTECTED]> writes: > > Is this not a bug? > > I don't actually see that it is. The documentation is perfectly clear > on the point: > > (It is your responsibility that the byte sequences you create > are valid characters in the server character set encoding.) > > (This is in 4.1.2.1. String Constants) If you don't want to deal with > this, don't use octal escapes to construct multibyte characters. >
I have thought about this some more, and I still disagree on a few points. First, there is no way, as a DBA, to _not_ use octal escapes, because there is no option to turn off. I can either inspect the application, or just trust it, but I can't control it from the database (short of adding a CHECK to every text column). Since it interferes with the proper use of COPY, which in turn interferes with the use of pg_dump and applications such as Slony, I think that it is something that should be controlled at database. Second, you pointed out that my patch breaks BYTEA. I'd like to point out that (as I understand it) the patch only breaks BYTEA if you rely on the cstring escaping to pass binary data to byteain, like: => SELECT E'\377'::bytea; In my opinion, that's wrong anyway, because it will fail if you do something like: => SELECT E'\377\000\377'; Because the NULL in the middle terminates the cstring, which passes a 1- byte string to byteain, resulting in a 1-byte value, instead of a 3-byte value which it should be. However, if you pass ASCII data to byteain that represents binary data, like: => SELECT E'\\377\\000\\377'::bytea; It seems to work just fine, and that's what PQescapeByteaConn() does. There may be a backwards-compatibility issue, but I don't think my patch is 100% broken (unless, of course, you found some other way that it's broken). This is a practical concern because some applications use escaping that protects against SQL injection, but does not protect against invalid byte sequences. It's not obvious to an application programmer that protecting against SQL injection is not enough, particularly if the programmer is trying to be "database agnostic" or doesn't test against different encodings. Ideally, they should use PQescapeStringConn() and not convert anything to an octal escape, but that's not always the case. Lastly, if we retain the current behavior, I think a note should be added to the COPY docs. The current docs imply that if you COPY it out, you can COPY it back in. Even if I read the above documentation note directly after reading the COPY documentation, it would not make me think that COPY could fail for built-in types. Regards, Jeff Davis ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly