On Wed, 27 Sep 2000, Dan Sugalski wrote:

> Well, the impression I get from Larry is that he wants the internals to be 
> character-formatting neutral. Scalars should be able to provide their 
> contents in a number of different formats, but perl would usually not have 
> One True Format internally. 

> I think the intention is that if data comes in in ASCII or EBCDIC or 
> Unicode or Shift-JIS or binary or whatever then perl will keep it that way 
> if it understands the format, and convert as needed by bits of the core.

This seems sensible, but "convert as needed" is going to be the tricky
part :-), and we'll need appropriate directives to keep the DWIMmer from
getting confused in unusual situations.

In broad terms, here's what I'd like to see from my end-user point of
view:

I use perl5 to deal with binary data and hope to use perl6 in the same
way. I'd like to be able to be *sure* that perl hasn't mistakenly changed
my data because it "looked like" a valid UTF-8 sequence or whatever.  I
want to be able to use other modules that may be Unicode-aware and yet
still have confidence that they haven't changed my data.  I don't want
some random module to automatically kick in a line discipline because my
image of a dendritic crystal happens to have an intensity at the upper
left hand corner that exactly matches the sequence of bits used to specify
some byte-order-mark or another.
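
To make the worry concrete, here is a minimal perl5 sketch (not anything
proposed for perl6) of the behaviour being asked for: data whose first
bytes happen to equal the UTF-8 byte-order mark is written and read back
with binmode() on both handles and must come back bit-for-bit identical.
The helper name roundtrip_raw, the file name crystal.raw, and the sample
data are all made up for illustration.

```perl
use strict;
use warnings;

# Write $data to $file and read it back, with binmode on both handles so
# that perl applies no newline translation or encoding layer on the way.
sub roundtrip_raw {
    my ($file, $data) = @_;

    open(my $out, '>', $file) or die "cannot write $file: $!";
    binmode($out);
    print {$out} $data;
    close($out) or die "cannot close $file: $!";

    open(my $in, '<', $file) or die "cannot read $file: $!";
    binmode($in);
    my $readback = do { local $/; <$in> };   # slurp the whole file
    close($in);

    return $readback;
}

# Hypothetical "image": the first three bytes equal the UTF-8 byte-order mark.
my $pixels = "\xEF\xBB\xBF" . pack("C*", 0 .. 255);
my $file   = "crystal.raw";                  # hypothetical file name

my $copy = roundtrip_raw($file, $pixels);
print $copy eq $pixels ? "bits untouched\n" : "data was altered\n";
unlink $file;
```

The point of the sketch is only that the caller had to ask for raw
handling explicitly; the concern above is about code elsewhere that does
not.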

I don't mind having to be explicit about telling perl6 to leave my bits
alone -- easy text processing probably ought to remain perl's strength --
nor do I mind too much if perl6 occasionally has to simply give up and die
due to utter internals confusion.

I don't think there's any real disagreement on these ideas, but there is
one additional issue that I think hasn't been fully appreciated: 

I want to be able to do all this *without* having to become a Unicode
expert (or even Unicode-literate).  It has always been possible to program
in a subset of Perl without knowing all the gnarly details of all the
other parts.  I don't want to have to inspect each and every release of
each and every module I use to see whether, in this release, some UTF
DWIMmery has been implemented that might foul me up.  I want to be able to
say at the top of my program something like

        no unicode qw(nada nil none leave_my_bits_alone);

and have it all "just work". 
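
For comparison, the nearest thing that exists in perl5 (5.6 and later)
is the bytes pragma. The sketch below is only an illustration of that
pragma, not of the hypothetical "no unicode" line above; the helper
names char_len and byte_len are made up. Note that "use bytes" is
lexically scoped, so it affects only the block it appears in and does
not reach into other modules, which is exactly the gap described here.

```perl
use strict;
use warnings;

my $smiley = "\x{263A}";    # one character; three bytes in perl's internal UTF-8

# Default semantics: length() counts characters.
sub char_len { my ($s) = @_; return length($s); }

# Inside the scope of "use bytes", length() counts bytes instead.
sub byte_len { my ($s) = @_; use bytes; return length($s); }

printf "characters: %d\n", char_len($smiley);
printf "bytes:      %d\n", byte_len($smiley);
```

Only byte_len sees byte semantics; any module called from the same
program keeps whatever behaviour its own author chose.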

Anything more complicated, and it seems to me that perl6 will have taken a
step towards making an easy task harder.

Sorry to rant like a toddler and say "I want" over and over again, but the
phrase "convert as needed" makes warning sirens go off in my head :-).

-- 
    Andy Dougherty              [EMAIL PROTECTED]
    Dept. of Physics
    Lafayette College, Easton PA 18042
