On 4/14/09, Peter Eisentraut <pete...@gmx.net> wrote:
> On Saturday 11 April 2009 00:54:25 Tom Lane wrote:
> > It gets worse though: I have seldom seen such a badly designed piece of
> > syntax as the Unicode string syntax --- see
> > http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE
> >
> > You scan the string, and then after that they tell you what the escape
> > character is!? Not to mention the obvious ambiguity with & as an
> > operator.
> >
> > If we let this go into 8.4, our previous rounds with security holes
> > caused by careless string parsing will look like a day at the beach.
> > No frontend that isn't fully cognizant of the Unicode string syntax is
> > going to parse such things correctly --- it's going to be trivial for
> > a bad guy to confuse a quoting mechanism as to what's an escape and what
> > isn't.
>
> Note that the escape character marks the Unicode escapes; it doesn't affect
> the quote characters that delimit the string. So offhand I can't see any
> potential for quote confusion/SQL injection type problems. Please elaborate
> if you see a problem.
>
> If there are problems, we could consider getting rid of the UESCAPE clause.
> Without it, the U&'' strings would behave much like the E'' strings. But I'd
> like to understand the problem first.
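To make the quoted claim concrete, here is a minimal sketch of the two-step reading: delimit the string first using only the plain '' rules, then expand the Unicode escapes in a separate pass. The function names `split_quoted` and `decode_uescapes` are hypothetical, this is not the PostgreSQL lexer, and it assumes stdstr=on behaviour (a backslash never escapes a quote). Unlike a real server, it leaves invalid escapes untouched instead of raising an error.

```python
import re


def split_quoted(sql, start):
    """First pass: return (body, index_after_closing_quote) for the
    '' string that begins at sql[start].

    Only the doubled-quote rule ('' -> ') is applied here; the Unicode
    escape character plays no part in finding the end of the string.
    """
    if sql[start] != "'":
        raise ValueError("expected opening quote")
    i = start + 1
    out = []
    while i < len(sql):
        if sql[i] == "'":
            if i + 1 < len(sql) and sql[i + 1] == "'":
                out.append("'")  # doubled quote is a literal quote
                i += 2
                continue
            return "".join(out), i + 1  # closing quote found
        out.append(sql[i])
        i += 1
    raise ValueError("unterminated string")


def decode_uescapes(body, esc="\\"):
    """Second pass: expand escXXXX, esc+XXXXXX and a doubled esc
    in a body that has already been delimited."""
    e = re.escape(esc)
    pattern = re.compile(
        e + r"(?:([0-9A-Fa-f]{4})|\+([0-9A-Fa-f]{6})|(" + e + "))"
    )

    def repl(match):
        if match.group(3):
            return esc  # doubled escape character stands for itself
        return chr(int(match.group(1) or match.group(2), 16))

    return pattern.sub(repl, body)


sql = "U&'d!0061t!+000061' UESCAPE '!'"
body, end = split_quoted(sql, 2)   # skip the leading U&
print(body)                        # raw body, escapes still unexpanded
print(decode_uescapes(body, "!"))  # escapes expanded with '!' as escape
```

In this reading, a tool that has never heard of U& still finds the same closing quote; only the interpretation of the body differs. Whether that holds with stdstr=off is exactly the open question.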

I think the problem is that they should not act like E'' strings; they should act like plain '' strings, that is, follow the stdstr (standard_conforming_strings) setting. That way existing tools that may (or may not) understand E'' and the stdstr setting, but have definitely not heard of U&'' strings, can still parse the SQL without new surprises.

If they already act that way, then keeping U& should be fine. And if UESCAPE does not affect the main string parsing, but is handled in a second pass over the already-parsed string, like the bytea \ escapes, then that should also be fine and should not cause any new surprises. But if not, it must go.

I would prefer that such quoting extensions wait until stdstr=on is the only mode Postgres operates in. Fitting new quoting rules into an environment with a flippable stdstr setting will be rather painful for everyone.

I still stand by my proposal: how about extending E'' strings with Unicode escapes (e.g. \uXXXX)? The E'' strings are already more clearly defined than '', and they are our "own": we don't need to consider random standards, but can consider our sanity.

--
marko