Bugs item #1583946, was opened at 2006-10-24 14:32
Message generated for change (Comment added) made by akuchling
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1583946&group_id=5470

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: John Nagle (nagle)
Assigned to: Nobody/Anonymous (nobody)
Summary: SSL "issuer" and "server" names cannot be parsed

Initial Comment:
(Python 2.5 library)

The Python SSL object offers two methods for obtaining the info from an SSL certificate, "server()" and "issuer()". These return strings. The actual values in the certificate are a series of key/value pairs in ASN.1 binary format. But what "server()" and "issuer()" return are single strings, with the key/value pairs separated by "/". However, "/" is a valid character in certificate data. So parsing such strings is ambiguous, and potentially exploitable.

This is more than a theoretical problem. The issuer field of Verisign certificates has a "/" in the middle of a text field:

"/O=VeriSign Trust Network/OU=VeriSign, Inc./OU=VeriSign International Server CA - Class 3/OU=www.verisign.com/CPS Incorp.by Ref. LIABILITY LTD.(c)97 VeriSign".

Note the "OU=Terms of use at www.verisign.com/rpa (c)00" with a "/" in the middle of the value field. Oops.

Worse, this is potentially exploitable. By ordering a low-level certificate with a "/" in the right place, you can create the illusion (at least for flawed implementations like this one) that the certificate belongs to someone else. Just order a certificate from GoDaddy, enter something like this in the "Name" field

"Myphonyname/C=US/ST=California/L=San Jose/O=eBay Inc./OU=Site Operations/CN=signin.ebay.com"

and Python code will be spoofed into thinking you're eBay. Fortunately, browsers don't use Python code.

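The ambiguity described above can be reproduced in a few lines of Python. This is only a minimal sketch, not the code in _ssl.c: "flatten" is a hypothetical stand-in that assumes the "/KEY=value" layout quoted in this report, and "naive_parse" is the sort of split-on-"/" parser an application might be tempted to write.

```python
def flatten(fields):
    """Hypothetical stand-in for the one-line form: join fields as /KEY=value."""
    return "".join("/%s=%s" % (key, value) for key, value in fields)


def naive_parse(oneline):
    """Split on "/" and "=", as a caller trusting the one-line form might."""
    pairs = []
    for part in oneline.split("/"):
        if part:
            key, _, value = part.partition("=")
            pairs.append((key, value))
    return pairs


# A certificate with a single CN field; the value is the "Name" from the report.
spoofed = [("CN", "Myphonyname/C=US/ST=California/L=San Jose/O=eBay Inc."
                  "/OU=Site Operations/CN=signin.ebay.com")]

# A certificate that really has seven separate fields.
genuine = [("CN", "Myphonyname"), ("C", "US"), ("ST", "California"),
           ("L", "San Jose"), ("O", "eBay Inc."), ("OU", "Site Operations"),
           ("CN", "signin.ebay.com")]

# Both flatten to the same string, so no parser can tell them apart.
same_string = flatten(spoofed) == flatten(genuine)
parsed = naive_parse(flatten(spoofed))
```

Under these assumptions "parsed" comes back as seven pairs ending in CN=signin.ebay.com, although the spoofed certificate held only one field.
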
The actual bug is in python/trunk/Modules/_ssl.c at

    if ((self->server_cert = SSL_get_peer_certificate(self->ssl))) {
        X509_NAME_oneline(X509_get_subject_name(self->server_cert),
                          self->server, X509_NAME_MAXLEN);
        X509_NAME_oneline(X509_get_issuer_name(self->server_cert),
                          self->issuer, X509_NAME_MAXLEN);

The "X509_NAME_oneline" function takes an X509_NAME structure, which is the certificate system's representation of a list, and flattens it into a printable string. This is a debug function, not one for use in production code. The SSL documentation for "X509_NAME_oneline" says:

"The functions X509_NAME_oneline() and X509_NAME_print() are legacy functions which produce a non standard output form, they don't handle multi character fields and have various quirks and inconsistencies. Their use is strongly discouraged in new applications."

What OpenSSL callers are supposed to do is call X509_NAME_entry_count() to get the number of entries in an X509_NAME structure, then get each entry with X509_NAME_get_entry(). A few more calls will obtain the name/value pair from the entry, as UTF8 strings, which should be converted to Python Unicode strings. OpenSSL has all the proper support, but Python's shim doesn't interface to it correctly.

X509_NAME_oneline() doesn't handle Unicode; it converts non-ASCII values to "\xnn" format. Again, it's for debug output only.

So what's needed are two new functions for Python's SSL sockets to replace "issuer" and "server". The new functions should return lists of Unicode strings representing the key/value pairs. (A list is needed, not a dictionary; two strings with the same key are both possible and common.)

The reason this now matters is that new "high assurance" certs, the ones that tell you how much a site can be trusted, are now being deployed, and to use them effectively, you need that info. Support for them is in Internet Explorer 7, so they're going to be widespread soon. Python needs to catch up.

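To illustrate why a list is needed rather than a dictionary, here is a small sketch using the VeriSign issuer fields quoted earlier. The list-of-tuples shape and the "values_for" helper are assumptions made for illustration only; no such API exists in the SSL module.

```python
# Issuer fields from the VeriSign example, as one (key, value) pair per entry.
# The tuple layout is an assumed shape, not an existing Python API.
issuer = [
    ("O", "VeriSign Trust Network"),
    ("OU", "VeriSign, Inc."),
    ("OU", "VeriSign International Server CA - Class 3"),
    ("OU", "www.verisign.com/CPS Incorp.by Ref. LIABILITY LTD.(c)97 VeriSign"),
]


def values_for(name, key):
    """Return every value stored under key, in certificate order (hypothetical helper)."""
    return [value for k, value in name if k == key]


# The list keeps all three OU entries, and "/" inside a value is harmless.
all_ous = values_for(issuer, "OU")

# A dictionary keeps only the last value seen for each key.
as_dict = dict(issuer)
```

With the list form, "all_ous" holds three entries; "as_dict" has only two keys, and the first two OU values are silently lost.
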
And, of course, this needs to be fixed as part of Unicode support.

John Nagle
Animats

----------------------------------------------------------------------

>Comment By: A.M. Kuchling (akuchling)
Date: 2006-10-27 08:54

Message:
Logged In: YES
user_id=11375

I've reworded the description in the documentation to say something like this: "Returns a string describing the issuer of the server's certificate. Useful for debugging purposes; do not parse the content of this string because its format can't be parsed unambiguously."

For adding new features: please submit a patch. Python's maintainers probably don't use SSL in any sophisticated way and therefore have no idea what shape better SSL/X.509 support would take.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2006-10-25 14:05

Message:
Logged In: YES
user_id=21627

Notice that RFC 2253 has been superseded by RFC 4514 (see my earlier message).

However, I really see no reason to fix this: even if the ambiguity problems were fixed, you *still* should not use the issuer and subject names in a security-relevant context.

----------------------------------------------------------------------

Comment By: John Nagle (nagle)
Date: 2006-10-25 13:26

Message:
Logged In: YES
user_id=5571

Actually, they don't do what they're "designed to do". According to the Python library documentation for SSL objects, the server method "Returns a string containing the ASN.1 distinguished name identifying the server's certificate. (See below for an example showing what distinguished names look like.)" The example "below" is missing from the documentation, so the documentation gives us no clue of what to expect.

There are several standardized representations for ASN.1 information. See "http://www.oss.com/asn1/tutorial/Explain.html" Most are binary. The only standard textual form is "XER", which is an XML representation of ASN.1 encoded information.
It's essentially the same representation used for parameters in SOAP.

So, given the documentation and the standard, what should be coming out is the XML representation of that data. Here's an entire X.509 certificate in XML:

http://www.gnu.org/software/gnutls/manual/html_node/An-X_002e509-certificate.html

The "issuer" field can be seen in there. It's awfully bulky. And making SSL dependent on the SOAP module probably isn't desirable. But that's an ASN.1 distinguished name in XML format, per the standard. That's probably not what's wanted by most users, although the ability to retrieve an entire certificate in XML format would be useful.

However, there's another standard string encoding, which is defined in RFC 2253. This is comma-separated UTF-8 with backslash escapes for special characters. That's reliably parseable. There's an OpenSSL function, "X509_NAME_print_ex", which does this formatting, but it doesn't output to a string. That's the right mechanism if it can be invoked in some way to yield a string. It should be invoked with flags = ASN1_STRFLGS_RFC2253, which yields a UTF8 string, which of course should become a Python Unicode string.

Now if someone can figure out how to get a string, instead of file output, out of OpenSSL's "X509_NAME_print_ex", we're home.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2006-10-25 04:38

Message:
Logged In: YES
user_id=21627

The bug is not in the server() and issuer() methods (which do exactly what they are meant to do); the bug is in applications which assume that the result of these methods can be parsed. As you point out, it cannot.

The functions, as is, don't present a security problem. If their result is presented as-is to the user, the user can determine herself whether she recognizes the entity referred-to in the distinguished name.

Notice that it is certainly possible to produce an unambiguous string representation of a distinguished name; RFC 4514 specifies an algorithm to do so (for use within LDAP).

Also notice that the SSL module does little to actually support trust: there is no verification of server-side certs, no access to extensions of a certificate, etc. So an application and a user should *not* trust the issuer name it received, anyway (unless there is an independent verification that the server certificate can be trusted).

All that said: If you think you need this functionality, please provide a patch to implement it.

----------------------------------------------------------------------

Comment By: John Nagle (nagle)
Date: 2006-10-24 18:40

Message:
Logged In: YES
user_id=5571

The problem isn't in the version of OpenSSL used in Python, which is at 0.9.8a. OpenSSL has had the necessary functions for years. But Python isn't using them. It's in "python/trunk/Modules/_ssl.c", as described above.

----------------------------------------------------------------------

Comment By: Gregory P. Smith (greg)
Date: 2006-10-24 18:05

Message:
Logged In: YES
user_id=413

Yes, OpenSSL 0.9.8d or later should be used for a new binary release.

http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-4343
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-3738
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-2940
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-2937

----------------------------------------------------------------------
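
The comma-separated, backslash-escaped string form that the comments above attribute to RFC 2253 and RFC 4514 can be sketched in pure Python. This is a rough illustration rather than a complete implementation: it assumes every entry is a single-valued RDN whose key is already a short attribute name, the function names are hypothetical, and it does not call OpenSSL.

```python
# Characters that RFC 4514 requires to be backslash-escaped anywhere in a value.
SPECIALS = '"+,;<>\\'


def escape_value(value):
    """Escape one attribute value following the RFC 4514 string rules."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ch in SPECIALS:
            out.append("\\" + ch)
        elif ch == "\x00":
            out.append("\\00")          # the null character is written as hex
        elif i == 0 and ch in " #":
            out.append("\\" + ch)       # leading space or number sign
        elif i == last and ch == " ":
            out.append("\\" + ch)       # trailing space
        else:
            out.append(ch)
    return "".join(out)


def format_dn(pairs):
    """Join (key, value) pairs with commas, last entry first as RFC 4514 specifies."""
    return ",".join("%s=%s" % (key, escape_value(value))
                    for key, value in reversed(pairs))


# The comma inside "VeriSign, Inc." is escaped, so it cannot be read as a separator.
example = format_dn([("O", "VeriSign Trust Network"), ("OU", "VeriSign, Inc.")])
```

In this form "/" needs no escaping because it is not a separator, and a "," inside a value is always preceded by a backslash, so a parser can split on unescaped commas without ambiguity.
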