> From: owner-openssl-us...@openssl.org On Behalf Of Pica Pica Contact
> Sent: Saturday, 28 July, 2012 14:41
> My application uses X.509 certificates with commonName field
> set to following format:
>
> number#UserName,
> Everything is ok when UserName is in ascii, but when I sign
> new certificates using <snip: ca ... -subj ... -utf8>
> and subject contains non-ASCII characters in UTF-8 encoding,
> the resulting certificate's CN looks this way:
>
> $ openssl x509 -in 30000.pem -subject -noout
>
> subject=
> /CN=\x003\x000\x000\x000\x000\x00#\x04B\x045\x04A\x04B\x10\xE2
> \x10\xD4\x10\xE1\x10\xE2N-V\xFD
>
> Looks like string "30000" is literally encoded as a sequence
> of bytes with corresponding decimal values, not as sequence
> of ASCII codes for characters "3", "0", "0",...

Nope. \xHH is exactly two hex digits for one byte. You have:
'\x00' '3' '\x00' '0' ... '\x00' '#' '\x04' 'B' '\x04' '5' ...
That is obviously the UCS-2 (BMPString) encoding of:
U+0033=digit3 U+0030=digit0,repeated4times U+0023=NumberSign
U+0442=Cyrillic.SmallTE U+0435 U+0441 U+0442
U+10E2=Georgian.LetterTar U+10D4 U+10E1 U+10E2
U+4E2D=CJK.something U+56FD=CJK.something
(the trailing 'N' '-' is 0x4E 0x2D, and 'V' '\xFD' is 0x56 0xFD).

Note that -utf8 only says how your input is encoded; it does not
mean the cert stores UTF-8 bytes. The name is stored as one of the
ASN.1 string types: several 1-byte codes (some now obsolete),
BMPString which is 2-byte UCS-2, UniversalString which is 4-byte
UCS-4, and UTF8String. I believe OpenSSL here selected the smallest
of the fixed-width types into which the specified (Unicode)
codepoints fit, which in this case is BMPString (UCS-2).

> After adding -nameopt oneline,-esc_msb,utf8 result looks fine
>

That should translate the Unicode to UTF8 and output it, and
assuming your terminal handles UTF8 then yes it will be good.

> I call X509_NAME_oneline() function inside my application to
> get CN string, and application fails to convert number from
> CN field to integer, because X509_NAME_oneline() returns
> "/CN=\x003\x000\x000\x000\x000\x00#" instead of "CN=30000#...".
>

I'm pretty sure _oneline is what x509 -text without -nameopt uses.
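
To make the decoding above concrete, here is a minimal Python sketch (not
OpenSSL code; the helper name unescape_oneline is made up for illustration).
It turns the escaped text printed by x509 -subject back into raw bytes and
decodes them as big-endian 2-byte units. It assumes the value contains no
characters outside the BMP, so UTF-16BE and UCS-2 coincide.

```python
def unescape_oneline(s: str) -> bytes:
    """Turn '\\xHH' escapes plus literal ASCII characters back into bytes.

    Hypothetical helper for illustration only; it is not part of OpenSSL.
    """
    out = bytearray()
    i = 0
    while i < len(s):
        if s.startswith("\\x", i):
            # '\xHH' is exactly two hex digits for one byte.
            out.append(int(s[i + 2:i + 4], 16))
            i += 4
        else:
            # Any other character was printed as itself (one ASCII byte).
            out.append(ord(s[i]))
            i += 1
    return bytes(out)


# The CN value from the x509 -subject output, with the wrapped line rejoined.
escaped = (r"\x003\x000\x000\x000\x000\x00#\x04B\x045\x04A\x04B\x10\xE2"
           r"\x10\xD4\x10\xE1\x10\xE2N-V\xFD")

raw = unescape_oneline(escaped)

# BMPString is big-endian UCS-2; for BMP-only text that equals UTF-16BE.
text = raw.decode("utf-16-be")

print(raw.hex())  # 003300300030003000300023044204350441044210e210d410e110e24e2d56fd
print(text)       # 30000#тестტესტ中国
```

In a C application the equivalent conversion is what ASN1_STRING_to_UTF8()
does on the CN entry's value.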
> Probably I should use X509_NAME_print_ex(),
>

Or if you only want CN, you could get the raw CN item and its value
out of the name structure, which in OpenSSL is STACK_OF(X509_NAME_ENTRY).

> but I have doubts if this string encoding is correct and how
> it would work with other software. For example, certtool from
> GnuTLS outputs subject string in this way:
> $ certtool -i --infile 30000.pem
>
> ...skipped...
>
> Subject:
> CN=#003300300030003000300023044204350441044210e210d410e110e24e2d56fd
> ...skipped...
>

That apparently is dumping the UCS-2 bytes. Compare to above.

> There are no such problems in "openssl req", I can set UTF8
> strings with numbers in certificate requests and resulting
> certificate is ok for me, but I need to ignore subject from
> certificate requests and set my own value
>
>
> Is it possible to fix "openssl ca" command somehow to encode
> numbers in UTF8 strings as strings, not numbers?

'ca' can only encode ASN.1 strings in the ways defined by ASN.1.
You must decode them accordingly.
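
As a rough illustration of "decode them accordingly", the Python sketch below
takes the hex string that certtool printed after "CN=#", decodes it as
big-endian UCS-2, and then splits the number#UserName format. The variable
names are mine, and it again assumes BMP-only text so that UTF-16BE is a
valid stand-in for UCS-2.

```python
# Hex digits copied from the certtool output above (after "CN=#").
hex_dump = "003300300030003000300023044204350441044210e210d410e110e24e2d56fd"

# Decode the raw BMPString bytes into a Unicode string.
cn = bytes.fromhex(hex_dump).decode("utf-16-be")

# The application's CN format is number#UserName: split on the first '#'.
number_text, sep, user_name = cn.partition("#")
if not sep:
    raise ValueError("CN does not contain '#'")

number = int(number_text)

print(number)     # 30000
print(user_name)  # тестტესტ中国
```

Once the value is decoded to text, the integer conversion works the same way
whichever ASN.1 string type the cert happens to use.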