I found that it is possible to get invalid strings from a PowerPoint file using AppleScript. The invalid string cannot be converted to UTF-8, and it corrupts NSXMLDocument output.

The problem occurs when iterating over the paragraphs in a shape; iterating over lines returns a correct string. The real issue, however, is that NSString accepts the invalid data without any error and returns an invalid instance. You can trim the invalid instance, get a mutable copy, replace characters, and so on. When you try to use it in NSXMLDocument, it silently corrupts the output.

When you try to log such a string with NSLog, it fails silently: the log line never appears! CFShow does print the string and shows some junk inside it.

On 10.4.11, the string is truncated by AppleScript. It can be converted to UTF-8 and logged with NSLog. When used in NSXMLDocument, it still corrupts the document, but does not truncate it.

I reported the bug (#5775749), but I guess that others would like to know about this issue.

Here is example code that shows the bug. To reproduce, download the source and example files from <http://nirs.freeshell.org/files/invalid-string.tbz>


// Run this in the directory where "Slide Text.scpt" is located.
// Compile: cc invalid-string.m -o invalid-string -framework Cocoa

#import <Cocoa/Cocoa.h>

int main () {
    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];

    NSDictionary *errorInfo = nil;

    // Get a list of shape text

    NSURL *url = [NSURL fileURLWithPath:@"Slide Text.scpt"];

    NSAppleScript *script = [[[NSAppleScript alloc] initWithContentsOfURL:url
                                                                    error:&errorInfo] autorelease];
    if (script == nil) {
        NSLog(@"Cannot load script: %@", errorInfo);
        exit(1);
    }

    NSAppleEventDescriptor *result = [script executeAndReturnError:&errorInfo];
    if (result == nil) {
        NSLog(@"Script error: %@", errorInfo);
        exit(1);
    }

    NSString *slideText = [result stringValue];


    //// Bugs:


    // 1. The string contains junk (probably a PowerPoint bug), but NSString
    //    should return nil or truncate the invalid data:

    CFShow(slideText);


    // 2. NSLog fails silently when printing this string; the log line is
    //    simply missing!

    NSLog(@"slide text: %@", slideText);


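    // 3. The string cannot be converted to UTF-8 (UTF8String returns NULL).
    //    Passing NULL to %s is undefined behavior, so check for it first:

    const char *utf8 = [slideText UTF8String];
    printf("utf8 string: %s\n", utf8 ? utf8 : "(NULL)");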


    // 4. The XML data is corrupted without any error: the element containing
    //    the invalid string is missing, part of the string appears, and the
    //    document is truncated after the invalid string:

    NSXMLDocument *doc = [NSXMLDocument document];
    [doc setCharacterEncoding:@"UTF-8"];
    NSXMLElement *root = [NSXMLElement elementWithName:@"doc"];
    [doc setRootElement:root];
    NSXMLElement *slide = [NSXMLElement elementWithName:@"slide"];
    [root addChild:slide];
    [slide setStringValue:slideText];
    NSData *data = [doc XMLDataWithOptions:NSXMLNodePrettyPrint];
    NSString *xmlString = [[[NSString alloc] initWithData:data
                                                 encoding:NSUTF8StringEncoding] autorelease];

    printf("%s\n", [xmlString UTF8String]);


    [pool release];
    return 0;
}
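
One possible workaround is to sanitize the string before it reaches NSXMLDocument. The following is only a minimal sketch in plain C (so it can be dropped into the same .m file). It assumes the junk consists of UTF-16 code units that XML 1.0 does not allow, such as control characters or unpaired surrogates; it is not confirmed that this is what PowerPoint actually returns. The names is_xml_bmp_char and sanitize_utf16_for_xml are hypothetical and not part of any Apple API.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if the UTF-16 code unit c may appear on its own in XML 1.0 text
   (tab, LF, CR, U+0020..U+D7FF, U+E000..U+FFFD). Surrogates are excluded
   here and handled as pairs by the caller. */
static int is_xml_bmp_char(uint16_t c) {
    if (c == 0x09 || c == 0x0A || c == 0x0D) return 1;
    if (c >= 0x20 && c <= 0xD7FF) return 1;
    if (c >= 0xE000 && c <= 0xFFFD) return 1;
    return 0;
}

/* Copy src to dst, dropping code units that XML 1.0 forbids and any
   unpaired surrogate. dst must have room for len code units.
   Returns the number of code units written to dst. */
size_t sanitize_utf16_for_xml(const uint16_t *src, size_t len, uint16_t *dst) {
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        uint16_t c = src[i];
        if (c >= 0xD800 && c <= 0xDBFF) {
            /* High surrogate: keep it only with a following low surrogate. */
            if (i + 1 < len && src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
                dst[out++] = c;
                dst[out++] = src[i + 1];
                i++;
            }
            continue;
        }
        /* Lone low surrogates fail this check and are dropped too. */
        if (is_xml_bmp_char(c)) {
            dst[out++] = c;
        }
    }
    return out;
}
```

From Objective-C, the input buffer could be filled with -[NSString getCharacters:range:] and the cleaned result turned back into a string with +[NSString stringWithCharacters:length:], since unichar is a 16-bit unsigned type. Whether this is enough to avoid the corruption described above would still have to be checked against the example file.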



Best Regards,

Nir Soffer
