On Wed, 27 Sep 2000, Dan Sugalski wrote:
> Well, the impression I get from Larry is that he wants the internals to be
> character-formatting neutral. Scalars should be able to provide their
> contents in a number of different formats, but perl would usually not have
> One True Format internally.
> I think the intention is that if data comes in in ASCII or EBCDIC or
> Unicode or Shift-JIS or binary or whatever then perl will keep it that way
> if it understands the format, and convert as needed by bits of the core.
This seems sensible, but "convert as needed" is going to be the tricky
part :-), and we'll need appropriate directives to keep the DWIMmer from
getting confused in unusual situations.
In broad terms, here's what I'd like to see from my end-user point of
view:
I use perl5 to deal with binary data and hope to use perl6 in the same
way. I'd like to be able to be *sure* that perl hasn't mistakenly changed
my data because it "looked like" a valid UTF-8 sequence or whatever. I
want to be able to use other modules that may be Unicode-aware and yet
still have confidence that they haven't changed my data. I don't want
some random module to automatically kick in a line discipline because my
image of a dendritic crystal happens to have an intensity at the upper
left hand corner that exactly matches the sequence of bits used to specify
some byte-order mark or other.
I don't mind having to be explicit about telling perl6 to leave my bits
alone -- easy text processing probably ought to remain perl's strength --
nor do I mind too much if perl6 occasionally has to simply give up and die
due to utter internals confusion.
I don't think there's any real disagreement on these ideas, but there is
one additional issue that I think hasn't been fully appreciated:
I want to be able to do all this *without* having to become a Unicode
expert (or even Unicode-literate). It has always been possible to program
in a subset of Perl without knowing all the gnarly details of all the
other parts. I don't want to have to inspect each and every release of
each and every module I use to see whether, perhaps in this release, some
UTF DWIMmery has been implemented that might foul me up. I want to be able to
say at the top of my program something like

    no unicode qw(nada nil none leave_my_bits_alone);

and have it all "just work".
Anything more complicated, and it seems to me that perl6 will have taken a
step towards making an easy task harder.
Sorry to rant like a toddler and say "I want" over and over again, but the
phrase "convert as needed" makes warning sirens go off in my head :-).
--
Andy Dougherty [EMAIL PROTECTED]
Dept. of Physics
Lafayette College, Easton PA 18042