I have a quick question about the Str type, described in Synopsis 2:
    Str     Perl string (finite sequence of Unicode characters)
Specifically, and partly in the interest of future-proofing, is there support in
Str for representing codepoint numbers that are beyond the range currently
described in the Unicode spec; e.g., can someone validly say "\x[263a123456789]"
and pass the result around as a Str value?
Or would there potentially be language constraints that prevent such code from
compiling or executing?
I think it would be useful for the above to be allowed, so that one could still
encode future larger codepoints under an older Perl that doesn't attribute any
meaning to them and just falls back to treating the Str as a generic string of
integers, which AFAIK is what happens by default when you don't have special
character tables handy.
That's not to say you can't also have a stricter subtype defined, e.g. Uni5_1Str,
which includes just the characters defined by Unicode version 5.1, for people
who want that restriction.
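To illustrate the lax-versus-strict distinction I have in mind, here is a
minimal sketch in Python rather than Perl 6, since what Perl 6 does here is
exactly my question. It models a string as a plain tuple of integers; the names
UNICODE_MAX, is_in_unicode_range and to_native are hypothetical and invented
for this example, and the range check only stands in for a real per-version
character table such as the one Uni5_1Str would need.

```python
# Hypothetical sketch: a "lax" string is just a tuple of non-negative integers.
# All names here are illustrative assumptions, not part of any Perl 6 API.

UNICODE_MAX = 0x10FFFF  # upper bound of the current Unicode code space


def is_in_unicode_range(codepoints):
    """Return True if every code point fits the current Unicode code space."""
    return all(0 <= cp <= UNICODE_MAX for cp in codepoints)


def to_native(codepoints):
    """Convert to a native Python str; only possible for in-range code points."""
    if not is_in_unicode_range(codepoints):
        raise ValueError("code point outside the current Unicode range")
    return "".join(chr(cp) for cp in codepoints)


# The last value is far beyond anything Unicode defines today.
lax_text = (0x48, 0x69, 0x263A123456789)
# Every value here is an ordinary Unicode code point.
strict_text = (0x48, 0x69, 0x263A)
```

Here lax_text can still be stored and passed around as a sequence of integers,
while only strict_text passes the stricter check and converts to a native
string.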
So if Perl's Str is lax in this way I think it should be documented somewhere
that a Str may contain a sequence of potential and not just actual Unicode
characters. Or if that already is documented, please say where.
And I want to emphasize that I'm not proposing changing the logical/conceptual
meaning of Str; it is still defined as a string of characters, not as a string
of integers.
One reason I'm asking is that I wanted to make the Text type of my Muldis D
language support arbitrarily large codepoints, partly for future-proofing, and
I'm hoping to be able to say that, when mapping the language to Perl 6, any
Text value can be represented simply by a Perl 6 Str value. But if Perl 6's Str
isn't likely to be that flexible, then I'd like to know for my planning purposes.
Thank you. -- Darren Duncan