On Wednesday, 28 April 2021 06:52:30 BST Glyph wrote: > > > On Apr 27, 2021, at 8:58 PM, Wim Lewis <w...@hhhh.org> wrote: > > > > On Thursday, April 8, 2021 8:43:35 AM PDT, Barry Scott wrote: > >> We just added a patch to our twisted to prevent twisted from doing idna > >> validation. > >> _idnaBytes and _idnaText not convert from bytes to unicode based on the > >> type of > >> the provided arg. > >> > >> We had to do this because there are domain names that youtube.com uses > >> that are > >> not valid under IDNA-2008 > >> https://tools.ietf.org/html/rfc5891#section-4.2.3.1 > > > > My reading of the RFC is that the YouTube domain you mention > > (r2---sn-aigzrn7e.googlevideo.com) is an invalid "U-Label", but that > > doesn't mean it's an entirely invaid domain label. It just means you can't > > legally run it through IDNA and turn it into "xn--r2---sn-aigzrn7e-". The > > intent, as I understand it, is to forbid any possibility of double-encoding > > or double-decoding a label, not to forbid the possibility of using labels > > like the one you mention. > > I agree with this reading. > > >> I can see why a UI would need to do IDNA-2008 converts and validation > >> but I'm not clear why its of value deep in the guts of twisted. > > > > My guess is that this is just an accident of the way that the > > bytes/characters distinction and the IDNA features were added to Twisted, > > and is probably a bug. > > +1. > > We also have other issues with the Python IDNA library: > https://github.com/kjd/idna/issues/18 <https://github.com/kjd/idna/issues/18> > and would generally like to reduce our strictness via whatever mechanisms we > can, even for things that genuinely require it (which this does not). > > >> Why is this code needed at all in twisted? > >> If its for a high level API then why isn't it being called at the > >> edge of the high level API calls? > > > > I'd argue that resolving URLs is in fact a high level API (from the point > > of view of the name resoution system) but even so, it seems to me that > > Twisted is doing the wrong thing here. The format of that label should > > prevent it from ever being transformed by IDNA, but shouldn't prevent it > > from being passed through unchanged, since it doesn't contain any > > codepoints outside of the usual ASCII range. > > Also agreed with all of this. > > >> The key idea here is that its human input that will be converted. > >> But the code is used deep in the _sslverify.py where no human > >> input is entered. > > > > _sslverify has to check whether the information in the server's certificate > > matches the URL that the user supplied. Certificates can contain Unicode > > text — at least in the (completely obsolete) CN-as-domain-name situation — > > so _sslverify probably picked up the requirement for IDNA transformations > > from that. (I don't remember whether dNSName SANs can contain unicode.) > > Yep. > > > What is the patch you decided to add to your version? Where in _sslverify > > did the problem surface?
When _idaBytes was called to raise an exception in ClientTLSOptions.__init__. > I am also very curious about this :). Attached is the patch we are using. We are using 19.07 for sad reasons. Barry
Only in tmp2: twisted-remove-idna-checks.patch diff -r -u tmp1/twisted-twisted-19.7.0/src/twisted/internet/_idna.py tmp2/twisted-twisted-19.7.0/src/twisted/internet/_idna.py --- tmp1/twisted-twisted-19.7.0/src/twisted/internet/_idna.py 2019-07-28 10:17:29.000000000 +0100 +++ tmp2/twisted-twisted-19.7.0/src/twisted/internet/_idna.py 2021-04-08 14:38:04.449987636 +0100 @@ -22,14 +22,10 @@ @return: The domain name's IDNA representation, encoded as bytes. @rtype: L{bytes} """ - try: - import idna - except ImportError: - return text.encode("idna") + if type(text) == bytes: + return text else: - return idna.encode(text) - - + return text.encode('ascii') def _idnaText(octets): """ @@ -43,12 +39,10 @@ @return: A human-readable domain name. @rtype: L{unicode} """ - try: - import idna - except ImportError: - return octets.decode("idna") + if type(octets) == bytes: + return octets.decode('ascii') else: - return idna.decode(octets) + return octets
_______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python