Re: NSXML and invalid UTF8 characters

Jens Alfke Thu, 28 Jan 2010 19:25:07 -0800

On Jan 28, 2010, at 3:47 PM, Keith Blount wrote:

> Many thanks for your reply. Wouldn't using these methods be a lot more 
> expensive (and slower) than going through using -characterAtIndex: or 
> something similar, accessing the characters directly, though?


No, because it's more efficient to let NSString itself do the searching, 
avoiding the overhead of a message-send per character.
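
As a rough sketch of what "letting NSString do the searching" can look like: the function below removes every character that belongs to a given set, using -rangeOfCharacterFromSet:options:range: rather than a per-character loop. The helper name StringByRemovingCharacters is made up for illustration and is not part of Cocoa.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Cocoa API): returns a copy of `input`
// with every character that is a member of `unwanted` removed.
static NSString *StringByRemovingCharacters(NSString *input, NSCharacterSet *unwanted) {
    NSMutableString *result = [NSMutableString stringWithString:input];
    // NSString scans its own storage here; no message send per character.
    NSRange found = [result rangeOfCharacterFromSet:unwanted];
    while (found.location != NSNotFound) {
        [result deleteCharactersInRange:found];
        // Resume the search at the position where the deletion happened.
        NSRange rest = NSMakeRange(found.location, [result length] - found.location);
        found = [result rangeOfCharacterFromSet:unwanted options:0 range:rest];
    }
    return result;
}
```

If the string is expected to be clean most of the time, the first -rangeOfCharacterFromSet: call returns NSNotFound and nothing else happens, which is the cheap common case.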

> I'm thinking that I would have to add every character to the character set 
> and then let NSString deal with all the underlying character stuff this way, 
> whereas if I could check the unicode char is within a range then it would be 
> faster.

You can easily create an NSCharacterSet from any range of Unicode values, 
e.g. with +characterSetWithRange:, so there is no need to add the characters 
one at a time.
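
For illustration, here is a minimal sketch of building such a set from ranges. The ranges are my reading of the XML 1.0 "Char" production (control characters other than tab, LF and CR, plus U+FFFE and U+FFFF); unpaired surrogates are deliberately left out to keep the sketch simple, and the function name XMLIllegalCharacterSet is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Cocoa API): a set of code points that
// XML 1.0 does not allow. Assumption: surrogates (U+D800-U+DFFF) are
// not handled in this sketch.
static NSCharacterSet *XMLIllegalCharacterSet(void) {
    // NSMakeRange takes (first code point, count).
    NSMutableCharacterSet *set =
        [NSMutableCharacterSet characterSetWithRange:NSMakeRange(0x00, 9)]; // U+0000-U+0008
    [set addCharactersInRange:NSMakeRange(0x0B, 2)];    // U+000B-U+000C
    [set addCharactersInRange:NSMakeRange(0x0E, 18)];   // U+000E-U+001F
    [set addCharactersInRange:NSMakeRange(0xFFFE, 2)];  // U+FFFE-U+FFFF
    return set;
}
```

A set like this can be passed directly to -rangeOfCharacterFromSet: to test whether a string contains anything XML would reject. It is probably worth building it once and caching it rather than recreating it for every string.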

BTW, it's inaccurate to say "invalid UTF-8". UTF-8 is just an encoding of 
Unicode. You're talking about Unicode characters that are illegal in XML. (I 
bring this up because there is such a thing as invalid UTF-8, i.e. byte 
sequences that are invalid in UTF-8 encoding, but it's an entirely different 
issue; this confused me when I first read your message.)

—Jens

_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)

