On Aug 19, 2010, at 19:27, Brad Stone wrote:

> Can someone help me figure out why (brackets not included)  [•        m]  
> (which is, on the Mac, an option-8 character, a tab character and a lowercase 
> m) converts to (brackets not included) [• m]?
> 
> The source text is quoted-printable UTF-8 text I created and saved in an XML 
> file in a different application.  All the other characters and line returns 
> translate perfectly but this option-8 tab combination does not.
> 
> Here is my source code.
> 
> Thanks
> 
> - (NSString *)stringWithQuotedPrintableString:(const char *)qpString {
> 
>     const char *p = qpString;
>     char *ep, *utf8_string = malloc(strlen(qpString) * sizeof(char));
>     NSParameterAssert( utf8_string );
>     ep = utf8_string;
> 
>     while( *p ) {
> 
>         switch( *p ) {
>             case '=':
> 
>                 NSAssert1( *(p + 1) != 0 && *(p + 2) != 0,
>                            @"Malformed QP String: %s", qpString);
>                 if( *(p + 1) != '\r' ) {
>                     int i, byte[2];
>                     for( i = 0; i < 2; i++ ) {
>                         byte[i] = *(p + i + 1);
>                         if( isdigit(byte[i]) )
>                             byte[i] -= 0x30;
>                         else
>                             byte[i] -= 0x37;
> 
>                         if (byte[i] >= 0 && byte[i] < 16) {
>                             continue;
>                         }
> 
>                         NSAssert( byte[i] >= 0 && byte[i] < 16,
>                                   @"bad encoded character");
>                     }
>                     *(ep++) = (char) (byte[0] << 4) | byte[1];
>                 }
>                 p += 3;
>                 continue;
>             default:
>                 *(ep++) = *(p++);
>                 continue;
>         }
>     }
>     return [[NSString alloc] initWithBytesNoCopy:utf8_string
>                                           length:strlen(utf8_string)
>                                         encoding:NSUTF8StringEncoding
>                                     freeWhenDone:YES];
> }

It would help if you could show the hex bytes actually being passed to 
'initWithBytesNoCopy...'.
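
One way to get those bytes is a small formatting helper, sketched here in plain 
C so it can be pasted into an Objective-C file. The name 'format_hex_bytes' and 
its signature are assumptions made for illustration and are not part of the 
code quoted above.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): formats `len` bytes from
   `bytes` as space-separated uppercase hex pairs into `out`, whose capacity
   `out_size` includes the terminating NUL. Returns how many bytes were
   formatted, which is less than `len` if `out` is too small. */
size_t format_hex_bytes(const char *bytes, size_t len, char *out, size_t out_size)
{
    size_t i, used = 0;

    if (out_size == 0)
        return 0;
    out[0] = '\0';

    for (i = 0; i < len; i++) {
        /* Each byte needs at most 3 characters (separator + 2 digits) plus the NUL. */
        if (out_size - used < 4)
            break;
        /* Cast through unsigned char so bytes >= 0x80 do not sign-extend. */
        used += (size_t)snprintf(out + used, out_size - used, "%s%02X",
                                 i ? " " : "", (unsigned char)bytes[i]);
    }
    return i;
}
```

Calling it on the decoded buffer just before the 'return' and passing the 
result to NSLog would show what the string is built from. For a correctly 
decoded option-8, tab, m sequence the bytes should read E2 80 A2 09 6D.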

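I note, however, that if your input contains the sequence "=\r", your code 
above still advances by three characters ('p += 3'), so it eats the character 
following the '\r'. That is harmless when that character is '\n', but if the 
line ends in a bare '\r' a real character is lost, which looks like a bug.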
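
A minimal sketch of a decoding loop that avoids this, again in plain C, is 
below. It is not the original poster's method. The names 'qp_decode' and 
'hex_value' are hypothetical, and the choices to accept a bare '\r' or '\n' 
after '=', to accept lowercase hex digits, and to copy a malformed '=' through 
unchanged are assumptions (the quoted-printable rules in RFC 2045 define a soft 
line break as '=' followed by CRLF).

```c
#include <stddef.h>

/* Returns the value of hex digit `c`, or -1 if `c` is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical decoder (an illustration, not the code from the post).
   Writes the decoded bytes to `out`, which must hold at least
   strlen(qp) + 1 bytes, NUL-terminates the result, and returns the
   decoded length. */
size_t qp_decode(const char *qp, char *out)
{
    const char *p = qp;
    char *ep = out;

    while (*p) {
        if (*p != '=') {
            *ep++ = *p++;                  /* ordinary character */
            continue;
        }
        if (p[1] == '\r') {
            /* Soft line break: skip "=\r", and the '\n' only if it is there. */
            p += (p[2] == '\n') ? 3 : 2;
        } else if (p[1] == '\n') {
            p += 2;                        /* soft line break with a bare LF */
        } else {
            int hi = hex_value(p[1]);
            /* p[2] is only read once p[1] is known to be a hex digit. */
            int lo = (hi >= 0) ? hex_value(p[2]) : -1;
            if (hi >= 0 && lo >= 0) {
                *ep++ = (char)((hi << 4) | lo);
                p += 3;
            } else {
                *ep++ = *p++;              /* malformed: keep the '=' as is */
            }
        }
    }
    *ep = '\0';                            /* terminate so strlen() is safe */
    return (size_t)(ep - out);
}
```

With this loop, "ab=\rcd" decodes to "abcd", whereas stepping three characters 
past the '=' would drop the 'c'. The returned length can be passed straight to 
the 'length:' argument, so the result does not depend on calling strlen on the 
output buffer.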


_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)