On Sat, 2004-05-01 at 11:26, Jarkko Hietaniemi wrote:

As for codepoints outside of \x00-\xff, I vote exception. I don't think
there's any other logical choice, but I think it's just an encoding
conversion exception, not a special bit-op exception (that's
arm-waving, I have not looked at Parrot's exception model yet... miles
to go...)
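
To make the proposed behaviour concrete, here is a minimal sketch in
Python (not Parrot code, and not a description of what Parrot does
today). The names EncodingConversionError, to_bytes and string_bitor
are hypothetical, and padding the shorter operand with zero bytes is an
assumption borrowed from how Perl 5 treats string | on operands of
unequal length. The point it illustrates is only that the failure comes
from the conversion step, not from the bit-op itself.

```python
class EncodingConversionError(ValueError):
    """Hypothetical stand-in for an encoding conversion exception."""


def to_bytes(s: str) -> bytes:
    # A string can take part in a bit-op only if every code point
    # fits in a single byte, i.e. lies in the range 0x00-0xFF.
    for ch in s:
        if ord(ch) > 0xFF:
            raise EncodingConversionError(
                f"code point U+{ord(ch):04X} does not fit in one byte"
            )
    return bytes(ord(ch) for ch in s)


def string_bitor(a: str, b: str) -> str:
    # The conversion happens first, so an out-of-range code point
    # fails here, before any bitwise work is attempted.
    x, y = to_bytes(a), to_bytes(b)
    # Assumption: pad the shorter operand with zero bytes on the right.
    n = max(len(x), len(y))
    x, y = x.ljust(n, b"\0"), y.ljust(n, b"\0")
    return "".join(chr(p | q) for p, q in zip(x, y))
```

With this sketch, string_bitor("A", "B") works on plain one-byte code
points, while string_bitor("\u0100", "\x02") raises the conversion
error.
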
> > This means that UTF-8 strings will be handled just fine, and (as I
>
> Please don't mix encodings and code points. That strings might be
> serialized or stored as UTF-8 should have no consequence with bitops.

What I meant was that UTF-8 IS going to be represented in a way that
will guarantee you won't get an exception when trying to do bit-ops. All
bets are off for many other encodings. While you're right that you might
get lucky, that wasn't really the point I was making.

Many languages (Perl included, I think) are going to encode strings as
UTF-8 by default, and this means that in the general case, we should not
expect exceptions to be thrown around any time we do a bit-op and
'A'|'B' will still be 'C' :-)

> Of course. But I would expect a horrible flaming death for
> "\x{100}"|+"\x02".

Well, if you consider a string conversion exception to be horrible
flaming death, then I hate to see what you do with a divide-by-zero ;-)

None of your response sounds overly scary to me, so I'll start looking
at what Parrot does NOW for bit-string-ops and see if it needs to mutate
to fit this model. Then I'll add in the rest. Then I get to see what
evil Dan and Leo perform upon my patch ;-)

--
Aaron Sherman <[EMAIL PROTECTED]>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback