On 12/04/2002 3:21 PM, Larry Wall wrote:
On Wed, Dec 04, 2002 at 11:38:35AM -0800, Michael Lazzaro wrote:
: We still need to verify whether we can have, in qq strings:
: : \033 - octal (p5; deprecated but allowed in p6?)

I think it's disallowed.
Thank the many gods ... or One True God, or Larry, or whatever your personal preference may be. ("So have a merry Christmas, Happy Hanukah, Kwazy Kwanzaa, a tip-top Tet, and a solemn, dignified Ramadan.")

\0o33 - octal
\0x1b - hex \0d123 - decimal
\0b1001 - binary
\x and \o are then just shortcuts.
Can we please also have \0 as a shortcut for \0x0?

\c[^H], for instance.  We can overload the \c notation to our heart's
desire, as long as we don't conflict with its use for named characters:

    \c[GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI]
Very Cool. (BTW, for those that don't follow Unicode, this means that everything matching /^[^A-Z ]$/ is fair game for us; Unicode limits charachter names to that to minimize chicken-and-egg problems. We /probably/ shouldn't take anything in /^[A-Za-z ]$/, to allow people to say the much more readable "\c[Greek Capital Letter Omega with Pepperoni and Pineapple]".

: There is also the question of what the bracketed format does. "Wide" : chars, e.g. for Unicode, seem appropriate only in hex. But it would : seem useful to allow a bracketed form for the others that prevents : ambiguities:
: : "\o164" ne "\o{16}4"
: "\d100" ne "\d{10}0"
: : Whether that means you can actually specify wide chars in \o, \d, and : \b or it's just a disambiguification of the Latin-1 case is open to : question.

There ain't no such thing as a "wide" character. \xff is exactly
the same character as \x[ff].
Which means that the only way to get a string with a literal 0xFF byte in it is with qq:u1[\xFF]? (Larry, I don't know that this has been mentioned before: is that right?) chr:u1(0xFF) might do it too, but we're getting ahead of ourselves.

Also, an annoying corner case: is "\0x1ff" eq "\0x[1f]f", or is it eq "\0x[1ff]"? What about other bases? Is "\0x1x" eq "\0x[1]", or is it eq "\0x[1x]" (IE illegal). (Now that I put those three questions together, the only reasonable answer seems to be that the number ends in the last place it's valid to end if you don't use explicit brackets.)

(BTW, in HTML and XML, numeric character escapes are decimal by default, you have to add a # for hex. In windows and several other OSes (I think, I like to play with Unicode but have little actual use for it), ALT-0nnn is spelt in decimal only. Decimal Unicode ordnals are fundimently flawed (since blocks are always on nice even hex numbers, but ugly decimal ones), but useful anyway).

-=- James Mastros

Reply via email to