Hello,

I am using the NSXML classes to generate and parse my own XML files. Sometimes
these files store strings of text that have been brought in from other
applications (for instance, a plain text representation of some text the user
has pasted in from Word).

In some instances NSXMLDocument's
-initWithContentsOfURLPreservingWhitespace:error: fails and returns nil with
errors such as "Char 0x0 out of allowed range" or "PCDATA invalid char
value 12". As I understand it, this is because XML doesn't allow certain ranges
of Unicode characters:

http://www.w3.org/TR/2000/REC-xml-20001006#NT-Char

In particular:

Character Range
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | 
[#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, 
FFFE, and FFFF. */
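
Read literally, that production is just a set of range checks on a Unicode
code point. A minimal sketch in plain C, assuming the caller already has
decoded code points (the function name is_xml_char is made up for
illustration and is not part of any Apple API):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if the Unicode code point cp matches
   the XML 1.0 Char production quoted above. */
bool is_xml_char(uint32_t cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD   /* tab, LF, CR */
        || (cp >= 0x20    && cp <= 0xD7FF)       /* stops before surrogates */
        || (cp >= 0xE000  && cp <= 0xFFFD)       /* excludes FFFE and FFFF */
        || (cp >= 0x10000 && cp <= 0x10FFFF);    /* supplementary planes */
}
```

Under this check both 0x0 and 12 (form feed) are rejected, which matches the
two error messages above.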

The "PCDATA invalid char" error was certainly caused by an NSFormFeedCharacter
(value 12). I don't know what the "Char 0x0" character is, but it is presumably
one from a Word document that isn't allowed.

So, my question is, what is the best way for me to filter out these invalid 
characters from my NSString before I pass it into NSXMLElement's 
-initWithName:stringValue: or similar methods, to avoid creating XML documents 
that won't open?

This page seems useful:

http://cse-mjmcl.cse.bris.ac.uk/blog/2007/02/14/1171465494443.html

It seems to indicate that I would need to write some C code to build up a
string without the invalid characters and then turn it into an NSString, but I
was wondering whether there are any methods built into the AppKit that already
strip these invalid XML characters. I have looked but couldn't see any. If not,
and anyone could give me pointers on using the above info to create a method
that does this, I would be very grateful. I'm self-taught, so all my knowledge
is high-level Cocoa and Objective-C, and I'd end up doing it all with
-appendString: and -stringWithFormat: style methods, which I know would be
wrong for this as it would be too slow; it apparently requires C.
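
For illustration only, the kind of C routine described above might look like
the sketch below. It assumes the text is available as a buffer of UTF-16 code
units (which is what an NSString hands out through -getCharacters:range:) and
compacts that buffer in place. The function name strip_invalid_xml_utf16 is
hypothetical, and whether dropping the characters (rather than replacing
them) is the right policy is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compacts buf in place, dropping every UTF-16 code
   unit that XML 1.0 does not allow, and returns the new length in units. */
size_t strip_invalid_xml_utf16(uint16_t *buf, size_t len)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        uint16_t c = buf[i];
        if (c >= 0xD800 && c <= 0xDBFF) {
            /* High surrogate: keep it only together with the low surrogate
               that follows; a valid pair encodes #x10000-#x10FFFF. */
            if (i + 1 < len && buf[i + 1] >= 0xDC00 && buf[i + 1] <= 0xDFFF) {
                buf[out++] = c;
                buf[out++] = buf[++i];
            }
        } else if (c == 0x9 || c == 0xA || c == 0xD ||
                   (c >= 0x20 && c <= 0xD7FF) ||
                   (c >= 0xE000 && c <= 0xFFFD)) {
            buf[out++] = c;
        }
        /* Anything else (control characters, lone surrogates, FFFE, FFFF)
           is skipped. */
    }
    return out;
}
```

On the Cocoa side, the idea would be to copy the string's characters into a
unichar buffer with -getCharacters:range:, run the routine over it, and build
the cleaned string with +stringWithCharacters:length: using the returned
length. The work is a single pass over the buffer with no intermediate string
objects.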

Many thanks in advance for any help anyone can give.

All the best,
Keith


      