[issue37845] SLCertVerificationError: Unable to handle SAN names (from Certifications) published with white spaces at start
New submission from David K.:

Unable to establish SSL connections using the company's private certificates when their SANs (Subject Alternative Names) contain at least one DNS name that starts with white space. Attempting to establish an SSL connection results in the exception:

SSLCertVerificationError("partial wildcards in leftmost label are not supported: ' *.x.y.com'.")

This situation made us dependent on SecOps in a big company, where ultimately none of the non-Python apps were affected by the change they made, and thus they couldn't or wouldn't fix the problem on their side for us. (We were at their mercy!)

I originally encountered this bug in Python 3.7 and fixed it manually in my own local Python environment. As the bug seems to be still unfixed to date, I am publishing this issue. A small and simple fix will follow shortly on GitHub.

--
assignee: christian.heimes
components: SSL
messages: 349600
nosy: DK26, christian.heimes
priority: normal
severity: normal
status: open
title: SLCertVerificationError: Unable to handle SAN names (from Certifications) published with white spaces at start
type: security
versions: Python 3.9

___
Python tracker <https://bugs.python.org/issue37845>
___

___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37845] SLCertVerificationError: Unable to handle SAN names (from Certifications) published with white spaces at start
Change by David K.: -- keywords: +patch pull_requests: +14979 stage: -> patch review pull_request: https://github.com/python/cpython/pull/15260
[issue37845] SLCertVerificationError: Unable to handle SAN names (from Certifications) published with white spaces at start
David K. added the comment:

Hi,

Judging by your comment, I think there is an unfortunate misunderstanding.

If you'd be kind enough, please let me explain:

1. The issue I had was indeed on Python 3.7, using the widely used "requests" library. Also, my change was -not- applied to the deprecated ssl.match_hostname() but to _dnsname_match(), which is a method of another inner class. My point is that it's still relevant.

2. Although there is a thin line here, it is not a configuration issue in the classic sense. It's an outside condition in a production environment that unpatched Python code cannot handle, which implies that Python is less stable and less tolerant of mistakes than C# and Java (those are the other, more widely used languages in the company, and they weren't affected by this problem. For those who, unlike me, wouldn't bother to patch the Python source code, this could be a deal breaker and a reason to move to another language).

3. It's a very simple fix that only removes white space (empty chars) from the start and end of the DNS name before applying all the other tests to it. In fact, by ensuring the DNS name input is clean, our code becomes -more- secure. In the current state, a mistyped DNS name encoded in a certificate could cause Python daemons to stop transmitting data without anyone noticing. Also, since humans tend to make such naive errors, a malicious party could make an attack look seamless and have it dismissed as human error. And if that doesn't convince you, we can say at the very least that the service we provide with our app becomes unusable and unavailable to clients. For some that could cost time and money, and Python may take the blame as unreliable compared to other languages.

4. The thrown exception can be misleading: it says that the problem is a partial wildcard. However, the problem is white space, which can be difficult to spot. White space cannot be part of DNS names, so it makes no sense to acknowledge it, refer to it, or even test it like any other legal character. This is also unpredictable for the programmer, who wouldn't expect such a basic trim/strip of white space to be missing from the core of the SSL code. What's worse, it can't be handled conventionally by catching the exception. While programmers can edit the Python source code to suit their needs, they really shouldn't mess with it for more than short-term use. Declining the change dooms me, for example, to always adding it to projects that use SSL.

Thank you for your time. I truly hope we can resolve this.

D.K

On Wed, Aug 14, 2019, 08:09 Christian Heimes wrote:

> Christian Heimes added the comment:
>
> This is not a bug in Python but a misconfiguration on your side. A
> workaround for a misconfiguration doesn't belong into upstream code. The
> certificate validation code is security-sensitive and I don't feel
> comfortable to add unnecessary string transformation to it. The code
> refuses bad wildcards because we have had more than one CVE related to
> wildcard matching.
>
> Besides the ssl.match_hostname() function is deprecated and no longer
> used. Starting with Python 3.7 the ssl module uses OpenSSL to verify host
> names.
>
> I suggest that you either ship this fix locally with your app. Or talk to
> IT again and have them replace the wrong certificate with a correct one
> that does not violate the standards.
>
> --
> resolution: -> rejected
> stage: patch review -> resolved
> status: open -> closed
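Point 4 above notes that the stray white space is hard to spot from the exception text alone. As a minimal diagnostic sketch (not part of the proposed patch, and not something the ssl module does), the helper below scans a certificate dictionary shaped like the result of `ssl.SSLSocket.getpeercert()` and reports DNS SAN entries that carry leading or trailing white space. The name `find_padded_dns_sans` and the sample dictionary are hypothetical, introduced only for illustration; how to obtain the dictionary for a certificate that already fails verification is left out of the sketch.

```python
def find_padded_dns_sans(cert):
    """Return DNS SAN values that carry leading or trailing white space.

    `cert` is assumed to be a dict shaped like the result of
    ssl.SSLSocket.getpeercert(), where 'subjectAltName' is a sequence of
    (type, value) pairs such as ('DNS', '*.x.y.com').
    """
    padded = []
    for kind, value in cert.get("subjectAltName", ()):
        # Only DNS entries are checked; other SAN types are ignored here.
        if kind == "DNS" and value != value.strip():
            padded.append(value)
    return padded


# Hypothetical certificate data mirroring the report in this thread.
sample_cert = {
    "subjectAltName": (
        ("DNS", " *.x.y.com"),  # leading space, as in the error message
        ("DNS", "x.y.com"),
    )
}

for name in find_padded_dns_sans(sample_cert):
    # repr() makes the stray white space visible in the output.
    print("suspicious SAN entry:", repr(name))
```

The check only reports the problem; it deliberately does not strip the value, in line with the maintainer's position that the certificate itself should be reissued.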
[issue37845] SLCertVerificationError: Unable to handle SAN names (from Certifications) published with white spaces at start
David K. added the comment:

OK, I see your point :)

Modification of the original certificate is legally problematic.

Many thanks for your patience and for taking the time to explain,

D.K

On Wed, Aug 14, 2019, 17:23 Christian Heimes wrote:

> Christian Heimes added the comment:
>
> On 14/08/2019 15.37, David K. wrote:
> >
> > David K. added the comment:
> >
> > Hi,
> >
> > Judging by your comment, I think there is an unfortunate misunderstanding.
> >
> > If you'd be kind enough, please let me explain:
> >
> > 1. The issue I had was indeed on Python 3.7, using the widely used
> > "requests" library. Also, my change was -not- applied to the deprecated
> > ssl.match_hostname() but to _dnsname_match(), which is a method of
> > another inner class. My point is that it's still relevant.
>
> Except it's not relevant any more because that function is no longer
> used by CPython's ssl module. Neither match_hostname nor any of its
> helper functions are called by code in the ssl module. Since 3.7 all
> hostname verification is now performed by OpenSSL code directly.
> ssl.match_hostname will be removed in 3.9 or 3.10.
>
> Latest urllib3 and requests don't use ssl.match_hostname() either. The
> urllib3 package has an older copy of the hostname matching algorithm in
> urllib3.packages.ssl_match_hostname. It should not be used with modern
> Python.
>
> > 2. Although there is a thin line here, it is not a configuration issue
> > in the classic sense. It's an outside condition in a production
> > environment that unpatched Python code cannot handle, which implies that
> > Python is less stable and less tolerant of mistakes than C# and Java
> > (those are the other, more widely used languages in the company, and
> > they weren't affected by this problem. For those who, unlike me,
> > wouldn't bother to patch the Python source code, this could be a deal
> > breaker and a reason to move to another language).
>
> Your setup has a misconfigured X.509 certificate with a SAN entry that
> violates standards for certificates. You are asking me to introduce a
> security bug into Python as a workaround for the misconfiguration.
>
> The algorithm in match_hostname() and _dnsname_match() implements RFC
> 6125, section 6.4.3, in a strict way. Python not only refuses to match
> invalid wildcard entries, it also fails hard on RFC 6125 violations.
>
> > 3. It's a very simple fix that only removes white space (empty chars)
> > from the start and end of the DNS name before applying all the other
> > tests to it. In fact, by ensuring the DNS name input is clean, our code
> > becomes -more- secure. In the current state, a mistyped DNS name encoded
> > in a certificate could cause Python daemons to stop transmitting data
> > without anyone noticing. Also, since humans tend to make such naive
> > errors, a malicious party could make an attack look seamless and have it
> > dismissed as human error. And if that doesn't convince you, we can say
> > at the very least that the service we provide with our app becomes
> > unusable and unavailable to clients. For some that could cost time and
> > money, and Python may take the blame as unreliable compared to other
> > languages.
>
> You view the fix as simple and harmless. I see it as a violation of
> standards and a security bug. X.509 certs are complex and fragile
> beasts. Python has had several CVEs in the hostname matching code
> because we didn't implement it correctly. Certificates are also used in
> legal contracts, e.g. to legally sign documents or establish mutual
> trust. You cannot just modify the content of a certificate.
>
> Since you are worried about the reliability of Python and started
> talking about money, have you considered donating money to the PSF?
> Python is maintained by unpaid volunteers. Donations to the Python
> Software Foundation help. (Disclaimer: No money in the world will change
> my opinion about "dn.strip()".)
>
> > 4. The thrown exception can be misleading: it says that the problem is
> > a partial wildcard. However, the problem is white space, which can be
> > difficult to spot. White space cannot be part of DNS names, so it makes
> > no sense to acknowledge it, refer to it, or even test it like any other
> > legal character. This is also unpredictable for the programmer, who
> > wouldn't expect such a basic trim/strip of white space to be missing
> > from the core of the SSL code. What's worse, it can't be handled
> > conventionally by catching the exception. While a programmer can edit
> > Python source code to it
[issue38656] mimetypes for python 3.7.5 fails to detect matroska video
David K. Hess added the comment:

Hi, I'm the author of the commit that's been fingered. Some comments about the behavior being reported.

First, as pointed out by @xtreak, the mimetypes module does use mimetypes files present on the platform to add to the built-in list of mimetypes. In this case, "video/x-matroska" and ".mkv" are not found in the mimetypes module and were never there; they are coming from the host OS.

Also, for better or worse, the mimetypes module has an internal "init" method that does more than just instantiate a MimeTypes instance for default use:

https://github.com/python/cpython/blob/5c0c325453a175350e3c18ebb10cc10c37f9595c/Lib/mimetypes.py#L345

It also loads these system files (and also Windows Registry entries on Win32) into a fresh MimeTypes instance. So, addressing what @The Compiler is seeing, properly resetting the mimetypes module really involves calling mimetypes.init(). By historical design, instantiating a MimeTypes class instance directly will not use host OS system mime type files.

As to why this commit is causing a change in the observed behavior: the problem that was corrected in this commit was that the mimetypes module had non-deterministic behavior related to initialization. In the original init code, the module-level mime types tables are changed (really corrupted) after first load, and you can never reinitialize the module back to a known good state (i.e. to the original module defaults without information from the host OS system).

So, realistically, the behavior currently observed is the correct behavior given the presence and historical nature of the init function. The fact that a fresh MimeTypes instance that has not been init()'d, or that was given no filenames, returned an OS entry prior to this commit is really part of the initialization bug which was fixed.

Regarding the ranger bug, the main thing is that you should not use a MimeTypes instance directly unless you run it through the same initializations that the init code does.

Anyway, that's my perspective having waded through all of that during the original BPO. I don't claim it's the correct one, but that's where we are at.

--
Python tracker <https://bugs.python.org/issue38656>
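To illustrate the advice above about running a directly created MimeTypes instance "through the same initializations that the init code does", here is a minimal sketch. It assumes the documented `mimetypes.knownfiles` list plus the `MimeTypes.read()` and `MimeTypes.read_windows_registry()` methods are enough to approximate what `mimetypes.init()` loads; the helper name `make_system_aware_mimetypes` is hypothetical and not part of the standard library.

```python
import mimetypes
import os


def make_system_aware_mimetypes(extra_files=()):
    """Build a MimeTypes instance loaded roughly the way mimetypes.init() does.

    Hypothetical helper: it starts from the module defaults, then adds
    Windows registry entries (where available) and any system mime.types
    files that exist on this host.
    """
    db = mimetypes.MimeTypes()
    # Returns quickly on platforms without a Windows registry.
    db.read_windows_registry()
    # mimetypes.knownfiles lists the conventional mime.types locations.
    for path in list(mimetypes.knownfiles) + list(extra_files):
        if os.path.isfile(path):
            db.read(path)
    return db


db = make_system_aware_mimetypes()
# Whether ".mkv" resolves depends on the Python version and the host OS files.
print(db.guess_type("movie.mkv"))
print(db.guess_type("page.html"))
```

Code that only needs the default, system-aware database can skip the helper and call the module-level `mimetypes.guess_type()` instead, which goes through `mimetypes.init()` on first use.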
[issue38656] mimetypes for python 3.7.5 fails to detect matroska video
David K. Hess added the comment:

The documentation you quoted does read to me as compatible? The database it is referring to is the one hardcoded in the module, not the one assembled from that and the host OS. But maybe this is just the vagaries of language and perspective at play.

Anyway, I do agree it is an unexpected behavior change from the perspective of a user of the MimeTypes class directly.

To get the best context for this change, it's useful to run through the long history of the issue that drove it:

https://bugs.python.org/issue4963

Note that the discussion there never touched on the use case of instantiating a MimeTypes class directly, and there are apparently no test cases covering this particular scenario either. With no awareness of this perspective/use case, it didn't get directly addressed.

Perhaps all MimeTypes instances should auto-load system files unless a new __init__ param selects for this new "clean" behavior?
[issue40139] mimetypes module racy
David K. Hess added the comment:

I’m not sure I can shed any light on this particular bug, but I would say that based on my dealings with this module, it is definitely not thread-safe. That means that if you are going to have multiple threads accessing it simultaneously, you really should have a mutex around that access, ensuring only one thread is running through the code in this module at a time.

Now in reality, asyncio and other cooperatively scheduled concurrency packages like gevent are not going to unpredictably yield control to another task the way true threads will. So, in this particular case, since the init code doesn’t use async or await, I don’t think there is a chance of an initialization race bug there.

As to the bug witnessed, the only thing I can suggest is to add a considerable amount of debugging that logs the argument to guess_type and prints out the mimetypes module’s internal state if and when this happens again. My best guess, based on the amount of work that method does to inspect the passed-in url, is that it has something to do with the url itself.

--
Python tracker <https://bugs.python.org/issue40139>
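The mutex suggestion above can be sketched as follows. This is only an illustration under the assumption that all callers in the process go through one wrapper; the names `_mimetypes_lock` and `guess_type_locked` are hypothetical, and the lock does nothing for code that calls the mimetypes module directly.

```python
import mimetypes
import threading

# One process-wide lock guarding every call into the mimetypes module.
_mimetypes_lock = threading.Lock()


def guess_type_locked(url, strict=True):
    """Call mimetypes.guess_type() while holding the shared lock."""
    with _mimetypes_lock:
        return mimetypes.guess_type(url, strict=strict)


results = []


def worker():
    # list.append is safe to call from several threads in CPython.
    results.append(guess_type_locked("index.html"))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

Because the first call to `guess_type()` may trigger the module's lazy initialization, taking the lock around that call also serializes initialization across threads.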
[issue38656] mimetypes for python 3.7.5 fails to detect matroska video
David K. Hess added the comment: @michael-lazar a documentation change seems the path of least resistance given the complicated history of this module. +1 from me.
[issue4963] mimetypes.guess_extension result changes after mimetypes.init()
David K. Hess added the comment: Thank you Steve! Nice to see this one make it across the finish line. -- Python tracker <https://bugs.python.org/issue4963>
[issue4963] mimetypes.guess_extension result changes after mimetypes.init()
Changes by David K. Hess: -- pull_requests: +3096
[issue4963] mimetypes.guess_extension result changes after mimetypes.init()
David K. Hess added the comment: FYI, PR opened: https://github.com/python/cpython/pull/3062
[issue4963] mimetypes.guess_extension result changes after mimetypes.init()
David K. Hess added the comment:

Ok, I followed @r.david.murray's advice and decided to take a shot at this.

First, I noticed that I couldn't reproduce the non-deterministic behavior that I reported above on the latest code (i.e. pre-3.7). After doing some research, it appears this was the sequence of events:

1) Pre-3.3, hashing was stable and this wasn't a problem.

2) Hash randomization became the default in version 3.3 and this non-determinism showed up.

3) A new dict implementation was introduced in 3.6, key orders became stable between runs, and this non-determinism was gone. However, as the notes on the new dict implementation indicate, this ordering should not be relied upon.

I also looked at some other issues:

* 6626 - The patch here basically rewrote the module. I agreed with the last comment on that issue that it probably doesn't need that.

* 24527 - Related to the .init() problems discussed here in r.david.murray's excellent analysis of the init behavior.

* 1043134 - Where the preferred extension issue was addressed via a proposed new map.

My approach with this patch is to address the init problem, the non-determinism and the preferred extension issue.

For the init, I made two changes:

1) I added new references to the initial values of the maps so they could be retained between init() calls. I also modified MimeTypes.__init__ to refer to these.

2) I modified the init() function to check the files argument as r.david.murray suggested. If it is supplied, then the existing database is used and the files are added to it. If it is not supplied, then the module reinitializes from scratch. I'll update the documentation to reflect this if the commit passes muster.

For the non-determinism and preferred extension, I changed the two extension type maps to be OrderedDicts. I then sorted the entries passed to the OrderedDict constructor by mime type and placed the preferred extension as the first extension to be processed. This guarantees that it will be the extension returned by guess_extension. The OrderedDict also guarantees that guess_all_extensions will always build and return the same value.

The commit can be reviewed here:

https://github.com/davidkhess/cpython/commit/ecabb1cb57e7e066a693653f485f2f687dcc7f6b

I'll open a PR if and when this approach gets enough positive feedback.
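The init() behavior described in this message (files supplied: add to the existing database; no files: reinitialize from scratch) can be observed with a small sketch. It assumes a Python version that contains the fix from this issue; the temporary file, the extension `xyzdemo` and the type `application/x-demo` are made up for illustration.

```python
import mimetypes
import os
import tempfile

# Write a throwaway mime.types-style file: "<type> <ext> [<ext> ...]".
with tempfile.NamedTemporaryFile("w", suffix=".types", delete=False) as fh:
    fh.write("application/x-demo xyzdemo\n")
    extra_file = fh.name

try:
    # No files argument: start from the module defaults plus system files.
    mimetypes.init()
    before = mimetypes.guess_type("sample.xyzdemo")

    # Files supplied: the existing database is kept and the file is added.
    mimetypes.init(files=[extra_file])
    with_extra = mimetypes.guess_type("sample.xyzdemo")

    # No files argument again: the module reinitializes from scratch,
    # so the made-up type should no longer be known.
    mimetypes.init()
    after_reset = mimetypes.guess_type("sample.xyzdemo")
finally:
    os.remove(extra_file)

print(before, with_extra, after_reset)
```

On interpreters that predate the fix, the last lookup may still return the made-up type, which is the "can never reinitialize back to a known good state" symptom this issue is about.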
[issue4963] mimetypes.guess_extension result changes after mimetypes.init()
David K. Hess added the comment: Pushed more commits so here's a branch compare: https://github.com/python/cpython/compare/master...davidkhess:fix-issue-4963
[issue4963] mimetypes.guess_extension result changes after mimetypes.init()
David K. Hess added the comment: Are there any committers watching this issue that are able to review the PR? https://github.com/python/cpython/pull/3062 It's close to 6 months old now with no action on it. I'm willing to help but doing so and then having the PR gather dust is pretty discouraging. Thanks in advance!