> From: owner-openssl-us...@openssl.org On Behalf Of Pica Pica Contact
> Sent: Saturday, 28 July, 2012 14:41

> My application uses X.509 certificates with commonName field 
> set to following format:
> 
> number#UserName,

> Everything is ok when UserName is in ascii, but when I sign 
> new certificates using <snip: ca ... -subj ... -utf8>
> and subject contains non-ASCII characters in UTF-8 encoding, 
> the resulting certificate's CN looks this way:
> 
> $ openssl x509 -in 30000.pem -subject  -noout
> 
> subject= 
> /CN=\x003\x000\x000\x000\x000\x00#\x04B\x045\x04A\x04B\x10\xE2
> \x10\xD4\x10\xE1\x10\xE2N-V\xFD
> 
> Looks like string "30000" is literally encoded as a sequence 
> of bytes with corresponding decimal values, not as sequence 
> of ASCII codes for characters "3", "0", "0",...

Nope. \xHH is exactly two hex digits for one byte. You have:
  '\x00' '3' '\x00' '0' ... '\x00' '#' '\x04' 'B' '\x04' '5' ... 
That is obviously the UCS-2 (BMPString) encoding of:
U+0033=digit3 U+0030=digit0 (four times) U+0023=NumberSign
U+0442=Cyrillic.SmallTE U+0435 U+0441 U+0442 U+10E2=Georgian.LetterTar
U+10D4 U+10E1 U+10E2 U+4E2D=CJK.something U+56FD=CJK.something
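
If you want to check that reading yourself, here is a rough Python sketch
(not anything OpenSSL ships). The helper name unescape_oneline is made up
for illustration, and it assumes every byte that is not escaped is shown
as its own ASCII character. The display form is ambiguous if the data
itself contains a backslash followed by "x" and two hex digits, so treat
this as a diagnostic aid only.

```python
import re

# Matches either one \xHH escape (exactly two hex digits) or any single character.
_TOKEN = re.compile(r"\\x([0-9A-Fa-f]{2})|(.)", re.DOTALL)

def unescape_oneline(value: str) -> bytes:
    """Turn the \\xHH-escaped display form back into raw bytes (hypothetical helper)."""
    out = bytearray()
    for m in _TOKEN.finditer(value):
        if m.group(1) is not None:
            out.append(int(m.group(1), 16))   # escaped byte
        else:
            out.append(ord(m.group(2)))       # printable ASCII shown as itself
    return bytes(out)

# The CN value exactly as printed by "openssl x509 -subject" in the question.
shown = (r"\x003\x000\x000\x000\x000\x00#\x04B\x045\x04A\x04B\x10\xE2"
         r"\x10\xD4\x10\xE1\x10\xE2N-V\xFD")

raw = unescape_oneline(shown)
text = raw.decode("utf-16-be")   # BMPString content is big-endian, 2 bytes per character
print([f"U+{ord(c):04X}" for c in text])
```

The printed list should line up one for one with the codepoints listed
above.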

Note that X.509 names are not simply "UTF8 text". ASN.1 defines several
string types: some 1-byte codes (some now obsolete), BMPString which is
2-byte UCS-2, UniversalString which is 4-byte UCS-4, and UTF8String.
I believe OpenSSL selects the smallest of the permitted types into which
the specified (Unicode) codepoints fit, which in this case is UCS-2.
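
To make the "smallest type that fits" idea concrete, here is a toy model
in Python. It is only my sketch of the rule as I described it, not
OpenSSL's code: the function name smallest_fit, the category labels and
the thresholds are assumptions, and the real list of candidate types
depends on the field and on configuration.

```python
# Characters allowed in an ASN.1 PrintableString.
PRINTABLE = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789 '()+,-./:=?"
)

def smallest_fit(text: str) -> str:
    """Pick the narrowest string category that can hold every character (toy model)."""
    if all(ch in PRINTABLE for ch in text):
        return "PrintableString"
    top = max(ord(ch) for ch in text)
    if top <= 0xFF:
        return "one-byte string"      # stands in for the 1-byte types
    if top <= 0xFFFF:
        return "BMPString"            # 2 bytes per character (UCS-2)
    return "UniversalString"          # 4 bytes per character (UCS-4)

for sample in ("30000", "30000#user", "30000#\u0442\u0435\u0441\u0442"):
    print(repr(sample), "->", smallest_fit(sample))
```

Under this model your CN lands in BMPString because the Cyrillic,
Georgian and CJK characters need more than one byte but all fit in two.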

> After adding -nameopt oneline,-esc_msb,utf8 result looks fine
> 
That should translate the Unicode to UTF8 and output it, and
assuming your terminal handles UTF8, then yes, it will look right.

> I call X509_NAME_oneline() function inside my application to 
> get CN string, and application fails to convert number from 
> CN field to integer, because X509_NAME_oneline() returns 
> "/CN=\x003\x000\x000\x000\x000\x00#" instead of "CN=30000#...".
> 
I'm pretty sure _oneline is what x509 uses (for -subject as well as
-text) when -nameopt is not given.

> Probably I should use X509_NAME_print_ex(),
> 
Or if you only want CN, you could get the raw CN item and its value 
out of the name structure which in OpenSSL is STACK_OF(X509_NAME_ENTRY).
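
In C that would, as far as I recall, go through
X509_NAME_get_index_by_NID(), X509_NAME_get_entry(),
X509_NAME_ENTRY_get_data() and then ASN1_STRING_to_UTF8() to get the
value as UTF-8. What remains after that is splitting your
number#UserName format. A small Python sketch of that last step only;
split_cn is a hypothetical helper, and it assumes the number is plain
ASCII digits followed by the first '#':

```python
from typing import Tuple

def split_cn(cn: str) -> Tuple[int, str]:
    """Split an already-decoded CN of the form number#UserName (hypothetical helper)."""
    number, sep, user = cn.partition("#")
    # Reject a missing '#' and anything that is not plain ASCII digits.
    if not sep or not number.isascii() or not number.isdigit():
        raise ValueError(f"CN is not in number#UserName form: {cn!r}")
    return int(number), user

print(split_cn("30000#\u0442\u0435\u0441\u0442"))

# The escaped display form is rejected instead of being half-parsed.
try:
    split_cn("\\x003\\x000\\x000\\x000\\x000\\x00#")
except ValueError as err:
    print("rejected:", err)
```

The point is to parse the decoded value, never the escaped display
string.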

> but I have doubts if this string encoding is correct and how 
> it would work with other software. For example, certtool from 
> GnuTLS outputs subject string in this way:
> $ certtool -i --infile 30000.pem
> 
> ...skipped...
> 
>         Subject: 
> CN=#003300300030003000300023044204350441044210e210d410e110e24e2d56fd
> ...skipped...
> 
That apparently is dumping the UCS-2 bytes. Compare to above.
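
A quick way to do that comparison, sketched in Python. It assumes the
hex after '#' is just the string content with no tag or length in front,
which is what the digits suggest:

```python
# Hex digits copied from the certtool output in the question, without the leading '#'.
dump = "003300300030003000300023044204350441044210e210d410e110e24e2d56fd"

raw = bytes.fromhex(dump)

# Group into 2-byte units, one per BMPString character.
units = [raw[i:i + 2].hex() for i in range(0, len(raw), 2)]
print(units)

# Decode as big-endian UCS-2 to get the original text back.
print(raw.decode("utf-16-be"))
```

Each 4-digit unit is one of the codepoints from the OpenSSL output,
starting with 0033 for the digit 3.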

> There are no such problems in "openssl req", I can set UTF8 
> strings with numbers in certificate requests and resulting 
> certificate is ok for me, but I need to ignore subject from 
> certificate requests and set my own value
> 
> 
> Is it possible to fix "openssl ca" command somehow to encode 
> numbers in UTF8 strings as strings, not numbers?

'ca' can only encode ASN.1 strings in the ways defined by ASN.1.
You must decode them accordingly.
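
"Decode accordingly" means: look at which ASN.1 string type the value
carries and pick the matching character encoding. A minimal Python
sketch of that mapping follows. The names CODECS and decode_value are
made up for illustration, and T61String is left out on purpose because
its character set is not something I would want to guess at.

```python
# ASN.1 string type -> Python codec for its content bytes.
CODECS = {
    "PrintableString": "ascii",
    "IA5String": "ascii",
    "UTF8String": "utf-8",
    "BMPString": "utf-16-be",        # 2-byte UCS-2, big-endian
    "UniversalString": "utf-32-be",  # 4-byte UCS-4, big-endian
}

def decode_value(type_name: str, content: bytes) -> str:
    """Decode the content bytes of a name value according to its string type."""
    try:
        codec = CODECS[type_name]
    except KeyError:
        raise ValueError(f"unsupported string type: {type_name}") from None
    return content.decode(codec)

print(decode_value("BMPString", bytes.fromhex("00330030003000300030")))
print(decode_value("PrintableString", b"30000"))
```

Both calls give the same text even though the bytes on the wire differ,
which is the whole point of decoding by type.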


