On Mar 25, 2014, at 11:28, Kyle Sluder <k...@ksluder.com> wrote:

> On Tue, Mar 25, 2014, at 10:49 AM, Quincey Morris wrote:
>> However, I also see this as a bug in your code, since you’re accepting
>> “random” user input as formatted text (i.e. escaped HTML) without
>> validation.
>
> Unfortunately, NSTextView lets users paste invalid UTF-16 codepoints
> directly into the NSTextStorage that backs the text view. We see this
> happen with OmniOutliner documents on occasion. Then the next time we
> try to load the document, libxml barfs on the invalid character
> entities.

“accepting … without validation” meant, in this context, setting the NSString as the value of a Core Data property.

The underlying problem is that NSString objects are (in general, AFAIK) merely sequences of UTF-16 code units, not sequences of *valid* UTF-16 code units, so there are valid NSStrings that aren’t valid Unicode. For example, you can AFAIK append the high surrogate 0xD800 to an NSString without it throwing an exception saying it isn’t followed by a low surrogate code unit.

That difference (sequences vs. valid sequences) suggests that an NSString of unknown provenance is always a suspect Unicode string, and that in a properly suspicious app no NSString should be admitted into a persistent store without having been validated.

In those terms, the problem in OmniOutliner wasn’t that it was handed a buggily invalid NSTextStorage, but that it too accepted input without validation.
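
To make "validated" concrete, here is a minimal sketch of the kind of check I have in mind, written in plain C so it can be called from Objective-C. The function name `utf16_is_well_formed` is made up for illustration and is not a Foundation API. It only checks surrogate pairing, which is the failure described above. It does not check for anything else a particular store or parser might reject.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Foundation): returns true when every
   surrogate code unit in units[0..count) belongs to a proper pair, i.e. a
   high surrogate (0xD800-0xDBFF) immediately followed by a low surrogate
   (0xDC00-0xDFFF). */
bool utf16_is_well_formed(const uint16_t *units, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint16_t u = units[i];
        if (u >= 0xD800 && u <= 0xDBFF) {
            /* High surrogate: the next unit must exist and be a low surrogate. */
            if (i + 1 >= count) {
                return false;
            }
            uint16_t next = units[i + 1];
            if (next < 0xDC00 || next > 0xDFFF) {
                return false;
            }
            i++; /* consume the low surrogate as well */
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            /* Low surrogate that was not preceded by a high surrogate. */
            return false;
        }
    }
    return true;
}
```

From Objective-C, one way to feed it would presumably be to copy the code units out with `-[NSString getCharacters:range:]` into a `unichar` buffer (casting to `uint16_t *` if needed) and to refuse to set the Core Data property when the check fails. What to do with a string that fails (reject it, or repair it by substituting U+FFFD) is a separate policy decision.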