Re: Bit ops on strings

Aaron Sherman Sat, 01 May 2004 09:00:35 -0700

On Sat, 2004-05-01 at 11:26, Jarkko Hietaniemi wrote:

As for codepoints outside of \x00-\xff, I vote exception. I don't think
there's any other logical choice, but I think it's just an encoding
conversion exception, not a special bit-op exception (that's arm-waving,
I have not looked at Parrot's exception model yet... miles to go...)


> > This means that UTF-8 strings will be handled just fine, and (as I
> 
> Please don't mix encodings and code points.  That strings might be
> serialized or stored as UTF-8 should have no consequence with bitops.

What I meant was that UTF-8 IS going to be represented in a way that
will guarantee you won't get an exception when trying to do bit-ops. All
bets are off for many other encodings. While you're right that you might
get lucky, that wasn't really the point I was making. Many languages
(Perl included, I think) are going to encode strings as UTF-8 by
default, and this means that in the general case, we should not expect
exceptions to be thrown around any time we do a bit-op and 'A'|'B' will
still be 'C' :-)

> Of course.  But I would expect a horrible flaming death for
> "\x{100}"|+"\x02".

Well, if you consider a string conversion exception to be horrible
flaming death, then I'd hate to see what you do with a divide-by-zero ;-)
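
For illustration only, here is a minimal Python sketch of the model discussed above. It is not Parrot code. The helper name `str_bitor` is hypothetical, and padding the shorter operand with zero bytes is an assumption of the sketch, not something settled in this thread. Code points outside \x00-\xff fail during the conversion step, so the error is an ordinary encoding conversion exception and not a special bit-op one.

```python
from itertools import zip_longest


def str_bitor(a: str, b: str) -> str:
    """Bitwise-OR two strings code point by code point (hypothetical helper)."""
    # Converting to Latin-1 raises UnicodeEncodeError for any code point
    # above 0xFF, which stands in for the "encoding conversion exception".
    left = a.encode("latin-1")
    right = b.encode("latin-1")
    # Assumption: the shorter operand is padded with zero bytes.
    ored = bytes(x | y for x, y in zip_longest(left, right, fillvalue=0))
    return ored.decode("latin-1")


print(str_bitor("A", "B"))  # 0x41 | 0x42 == 0x43, prints C

try:
    str_bitor("\u0100", "\x02")
except UnicodeEncodeError as err:
    print("conversion failed:", err.reason)
```

The bit-op itself never sees the out-of-range code point in this sketch; whether Parrot should signal the failure the same way is exactly the open question above.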

None of your response sounds overly scary to me, so I'll start looking
at what Parrot does NOW for bit-string-ops and see if it needs to mutate
to fit this model. Then I'll add in the rest. Then I get to see what
evil Dan and Leo perform upon my patch ;-)
 
-- 
Aaron Sherman <[EMAIL PROTECTED]>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback
