Lately there has been an increasing number of user questions indicating that the escape sequences available in string literals lack a way to enter characters that cannot be typed directly. The problem shows up in one of two ways:
1. A user wants to enter a character with a known Unicode code point (usually a 16-bit value). He cannot use the \ddd notation directly; he first has to convert the code point to UTF-8 by hand, which yields a sequence of one or more bytes, and then write each byte as an octal escape. This conversion is tedious and error-prone. (This case assumes the user knows that the server encoding is UTF-8.)

2. A user wants to enter an arbitrary character in his client encoding but cannot type it on his keyboard. Say you want to enter the Euro sign. In ISO-8859-15 the Euro sign is decimal 164 (octal 244), so you try '\244'. But the byte produced by this escape is interpreted in the server encoding, and if you don't know the server encoding (which you shouldn't be required to), you cannot rely on this. If the server encoding is UTF-8, that byte is an illegal byte sequence.

Obviously, the \ddd notation missed the train when the world was introduced to multibyte encodings and encoding conversion. I guess we cannot change it anymore, but we need a new mechanism.

One possibility is to introduce the notation from Java, '\uXXXX' (hexadecimal digits), to designate a Unicode character. The character would then be converted to whatever the server encoding is. This would solve problem #1 from above. Problem #2 would be solved only indirectly: the user would always have to look up the code point in Unicode rather than in the client encoding.

Another possibility is to introduce a new notation that designates a specific code point in the client encoding. Say we call it '\yXXXX'; then if your client encoding is ISO-8859-15 you can enter a Euro sign using '\yA4', and if your client encoding is UTF-8 you can enter it using '\y20AC'. I'm not sure, however, whether all encodings have the concept of a code point.
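To make the two problems and the '\uXXXX' idea concrete, here is a rough sketch in Python. It is only an illustration and not PostgreSQL code: the function names (utf8_bytes_by_hand, octal_escapes, expand_u_escape) are made up for this example, and the "server encoding" is just a parameter passed to Python's codecs.

```python
# Illustration only: plain Python, not PostgreSQL code.
# All function names below are hypothetical.

def utf8_bytes_by_hand(codepoint: int) -> bytes:
    """Encode one code point as UTF-8 with explicit bit operations.

    This is the manual conversion from problem #1. It does no validation
    (surrogates, values above 0x10FFFF); it is a sketch only.
    """
    if codepoint < 0x80:
        return bytes([codepoint])
    if codepoint < 0x800:
        return bytes([0xC0 | (codepoint >> 6),
                      0x80 | (codepoint & 0x3F)])
    if codepoint < 0x10000:
        return bytes([0xE0 | (codepoint >> 12),
                      0x80 | ((codepoint >> 6) & 0x3F),
                      0x80 | (codepoint & 0x3F)])
    return bytes([0xF0 | (codepoint >> 18),
                  0x80 | ((codepoint >> 12) & 0x3F),
                  0x80 | ((codepoint >> 6) & 0x3F),
                  0x80 | (codepoint & 0x3F)])


def octal_escapes(data: bytes) -> str:
    """Render each byte in the \\ddd (octal) notation."""
    return "".join("\\%03o" % b for b in data)


def expand_u_escape(hex_digits: str, server_encoding: str) -> bytes:
    """Assumed behaviour of a '\\uXXXX' escape: take the Unicode code
    point and convert it to whatever the server encoding is."""
    return chr(int(hex_digits, 16)).encode(server_encoding)


if __name__ == "__main__":
    euro = 0x20AC

    # Problem #1: what the user has to work out by hand today.
    print(octal_escapes(utf8_bytes_by_hand(euro)))  # \342\202\254

    # Problem #2: the byte from '\244' means different things per encoding.
    print(bytes([0o244]).decode("iso-8859-15") == chr(euro))  # True
    try:
        bytes([0o244]).decode("utf-8")
    except UnicodeDecodeError:
        print("byte 0xA4 on its own is not valid UTF-8")

    # The '\uXXXX' idea: same input, bytes depend on the server encoding.
    print(expand_u_escape("20AC", "iso-8859-15"))  # b'\xa4'
    print(expand_u_escape("20AC", "utf-8"))        # b'\xe2\x82\xac'
```

The point of the last two lines is that the user writes the same thing in both cases and never needs to know the server encoding.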
If you're concerned about adding more nonstandard escape sequences, or about how to implement them given the variable-length data after the magic letter, you can also think of these as a new function, so you could write:

    'The price is ' || unicode(0x20AC) || ' 200.'

This is uglier but more flexible.

Comments/better ideas?

-- 
Peter Eisentraut [EMAIL PROTECTED]