[issue854511] Thai encoding alias for 'cp874'
era added the comment:

Closing the entire enhancement request just because one detail is off seems insane.

Anyway, until the day in the distant future when Python can support encoding names in common circulation, http://stackoverflow.com/a/1064191/874188 offers a crude workaround.

    import encodings
    if 'windows_874' not in encodings.aliases.aliases:
        encodings.aliases.aliases['windows_874'] = 'cp874'

This is tricky in a number of ways; in practice, this snippet needs to be at the very start of your source file. Also, the underscore is correct even for email encoding names like =?windows-874?Q?hello=3F?= which use a dash (the dash gets remapped to underscore internally when looking up the encoding alias).

--
nosy: +era
___
Python tracker <http://bugs.python.org/issue854511>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
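The workaround quoted in the comment above can be wrapped in a small helper. This is a minimal sketch, not part of the tracker comment: `register_alias` is a hypothetical name, and the normalization it applies (lowercase, dash to underscore) is an assumption based on the remark that dashes are remapped internally. It should run before the first lookup of the alias, since a lookup that has already failed may be cached.

```python
import encodings.aliases


def register_alias(alias, codec):
    """Map an extra encoding name onto an existing Python codec.

    Hypothetical helper: stores the alias in the form the codec
    machinery is assumed to look up (lowercase, '-' replaced by '_').
    """
    key = alias.lower().replace('-', '_')
    # setdefault leaves any alias that Python already defines untouched.
    encodings.aliases.aliases.setdefault(key, codec)
    return key


register_alias('windows-874', 'cp874')

# The same bytes decoded through the alias and through the codec itself.
thai = b'\xa1\xd2'
print(thai.decode('windows-874') == thai.decode('cp874'))
```

Using `setdefault` rather than plain assignment keeps the sketch harmless on interpreters that already know the alias.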
[issue35547] email.parser / email.policy does not correctly handle multiple RFC2047 encoded-word tokens across RFC5322 folded headers
era added the comment:

I don't think this is a bug. My impression is that encoded words should be decodable in isolation.

--
nosy: +era
___
Python tracker <https://bugs.python.org/issue35547>
___
[issue36261] email examples should not gratuitously mess with preamble
New submission from era :

Several of the examples in the email module documentation modify the preamble. This is not good practice.

The email MIME preamble is really only useful for communicating information about MIME itself, not for general human-readable content like 'Our family reunion'.

The MIME preamble is problematic because it typically only supports ASCII and often defaults to an English-language message, even when applications are used in locales where English is not widely understood. For this reason, it is moderately useful to be able to override the preamble from Python code; but this should by no means be done routinely, and the documentation should certainly not demonstrate this in basic examples.

--
components: email
messages: 337657
nosy: barry, era, r.david.murray
priority: normal
severity: normal
status: open
title: email examples should not gratuitously mess with preamble
type: behavior
versions: Python 3.7
___
Python tracker <https://bugs.python.org/issue36261>
___
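To illustrate the point made in the report above, here is a minimal sketch of a basic multipart example that never touches the preamble. The addresses, the attachment bytes and the file name are placeholders invented for the illustration; only the subject text comes from the report.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg['Subject'] = 'Our family reunion'
msg['From'] = 'sender@example.com'      # placeholder address
msg['To'] = 'family@example.com'        # placeholder address

# Human-readable text belongs in a body part, not in the MIME preamble.
msg.set_content('Photos from the reunion are attached.')

# Adding an attachment converts the message to multipart/mixed.
msg.add_attachment(b'placeholder image bytes',
                   maintype='image', subtype='png',
                   filename='reunion.png')

# The preamble attribute was never assigned.
print(msg.preamble)
```

The text a reader is meant to see lives in the `text/plain` part, so it is displayed by every MIME-aware client and can carry a non-ASCII charset if needed.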
[issue34459] email.contentmanager should use IANA encoding
New submission from era :

https://github.com/python/cpython/blob/3.7/Lib/email/contentmanager.py#L64 currently contains the following code:

    def get_text_content(msg, errors='replace'):
        content = msg.get_payload(decode=True)
        charset = msg.get_param('charset', 'ASCII')
        return content.decode(charset, errors=errors)

This breaks when the IANA character-set label is not identical to the Python encoding name. For example, pass it a message with

    Content-type: text/plain; charset=cp-850

This breaks for two separate reasons (and I will report two separate bugs): the IANA character-set label should be looked up and converted to a Python codec name (that's this bug), and the character-set alias 'cp-850' is not defined in the lookup table in the first place.

There are probably other places in contentmanager.py where a similar mapping should take place.

I do not have a proper patch, but in general outline, the fix would look like

    +from email.charset import Charset
    +
     def get_text_content(msg, errors='replace'):
         content = msg.get_payload(decode=True)
         charset = msg.get_param('charset', 'ASCII')
    -    return content.decode(charset, errors=errors)
    +    encoding = Charset(charset).output_charset
    +    return content.decode(encoding, errors=errors)

This was discovered in this Stack Overflow post: https://stackoverflow.com/a/51961225/874188

--
components: email
messages: 323869
nosy: barry, era, r.david.murray
priority: normal
severity: normal
status: open
title: email.contentmanager should use IANA encoding
versions: Python 3.7
___
Python tracker <https://bugs.python.org/issue34459>
___
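Until the library maps labels itself, calling code can do the mapping before decoding. The following is a minimal sketch under stated assumptions, not the proposed patch: `EXTRA_LABELS`, `codec_for_label` and `decode_payload` are hypothetical names, the table holds only the two labels mentioned in these reports, and unknown labels fall back to ASCII, mirroring the default in the quoted function.

```python
import codecs

# Hypothetical table of labels seen in mail that Python does not resolve.
EXTRA_LABELS = {'cp-850': 'cp850', 'windows-874': 'cp874'}


def codec_for_label(label, default='ascii'):
    """Return a Python codec name for a charset label, or the default."""
    name = EXTRA_LABELS.get(label.strip().lower(), label)
    try:
        # codecs.lookup() raises LookupError for names Python cannot resolve.
        return codecs.lookup(name).name
    except LookupError:
        return default


def decode_payload(payload, label, errors='replace'):
    """Decode raw payload bytes using the mapped codec."""
    return payload.decode(codec_for_label(label), errors=errors)


print(codec_for_label('cp-850'))
```

The sketch deliberately avoids `email.charset.Charset` so that it does not depend on which attribute of that class is appropriate for decoding input.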
[issue34460] email.charset: common IANA labels missing
New submission from era :

The email.charset module should contain common informal character-set identifiers even if they are not formally specified in an IANA RFC.

From a quick grep of a pile of recent email, I find the following:

    46 "cp-850"
     6 "windows-874"

For scale, the same collection contained around 10,000 messages with "utf-8" and 2,000 with "iso-8859-1". Still, the fact that there are multiple occurrences in a spool of recent messages indicates that they are fairly common.

Currently, the email module throws a traceback if you attempt to parse a message whose character set is not known to Python. This is not possible to prevent in the general case, but making it more robust with encodings which are reasonably prevalent in the wild would definitely be desirable.

For what it's worth, "cp-850" is apparently an alias for IBM code page 850, which is defined with the name "cp850" in RFC1345. "windows-874" is an official designation which is detailed in https://www.iana.org/assignments/charset-reg/windows-874 and which is apparently equivalent to the Python codec "cp874".

--
components: email
messages: 323870
nosy: barry, era, r.david.murray
priority: normal
severity: normal
status: open
title: email.charset: common IANA labels missing
versions: Python 3.6
___
Python tracker <https://bugs.python.org/issue34460>
___
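The `email.charset` module already exposes `add_alias()` for registering extra labels at run time. The sketch below registers the two labels from the report above; treating the Python codec names `cp850` and `cp874` as the canonical names is an assumption made for the illustration, and whether other parts of the email package consult this table is exactly what issue34459 is about.

```python
import email.charset
from email.charset import Charset

# Register the informal labels as aliases (canonical names are assumed).
email.charset.add_alias('cp-850', 'cp850')
email.charset.add_alias('windows-874', 'cp874')

# Charset lowercases the label and filters it through the alias table.
cs = Charset('CP-850')
print(cs.input_charset)
```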
[issue34459] email.contentmanager should use IANA encoding
era added the comment:

https://bugs.python.org/issue34460 now requests the addition of "cp-850" and "windows-874" as charset aliases in the email.charset module.

--
___
Python tracker <https://bugs.python.org/issue34459>
___
[issue17305] IDNA2008 encoding missing
era added the comment:

At least the following existing domain names are rejected by the current implementation, apparently because they are not IDNA2003-compatible.

    XNNNC9BXA1KSA.COM
    XN--14-CUD4D3A.COM
    XN--YGB4AR5HPA.COM
    XN---14-00E9E9A.COM
    XN--MGB2DAM4BK.COM
    XN--6-ZHCPPA1B7A.COM
    XN--3-YMCCH8IVAY.COM
    XN--3-YMCLXLE2A3F.COM
    XN--4-ZHCJXA0E.COM
    XN--014-QQEUW.COM
    XN--118-Y2EK60DC2ZB.COM

As a workaround, in the code where I needed to process these, I used a fallback to string[4:].decode('punycode'); this was in a code path where I had already lowercased the string and established that string[0:4] == 'xn--'.

As a partial remedy, supporting a relaxed interpretation of the spec somehow would be useful; see also (tangentially) issue #12263.

--
nosy: +era
___
Python tracker <http://bugs.python.org/issue17305>
___
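The fallback described in the comment above can be sketched as follows for Python 3, where the label has to be converted to bytes before decoding. `decode_label` and `decode_hostname` are hypothetical names, and the example label is a well-known test value, not one of the domains listed in the comment.

```python
def decode_label(label):
    """Decode one DNS label, falling back to raw punycode."""
    label = label.lower()
    if not label.startswith('xn--'):
        return label
    raw = label.encode('ascii')
    try:
        # Built-in IDNA codec; may reject some labels.
        return raw.decode('idna')
    except UnicodeError:
        # Relaxed fallback: strip the 'xn--' prefix and decode the rest.
        return raw[4:].decode('punycode')


def decode_hostname(hostname):
    """Apply decode_label to every dot-separated label."""
    return '.'.join(decode_label(part) for part in hostname.split('.'))


print(decode_hostname('XN--BCHER-KVA.COM'))
```

The fallback skips the validation that the `idna` codec performs, so its result should be treated as display text rather than as a verified domain name.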
[issue1757072] Zipfile robustness
era added the comment:

For those who cannot update just yet, see also the workaround at http://stackoverflow.com/a/21996397/874188

--
nosy: +era
___
Python tracker <http://bugs.python.org/issue1757072>
___
[issue22929] cp874 encoding almost empty
New submission from era:

I created a simple script to map character codes in the 8bit range to Unicode for simple lookup: https://github.com/tripleee/8bit

In the generated output, on Python 2.6.6 (but corroborated on Python 2.7.6), almost all character codes come up as "undefined" in CP874.

According to http://en.wikipedia.org/wiki/ISO/IEC_8859-11, CP874 should be a superset of ISO-8859-11, with a few character codes *added* in the ISO control range.

--
components: Unicode
messages: 231596
nosy: era, ezio.melotti, haypo
priority: normal
severity: normal
status: open
title: cp874 encoding almost empty
type: behavior
versions: Python 2.7
___
Python tracker <http://bugs.python.org/issue22929>
___
[issue22929] cp874 encoding almost empty
era added the comment:

My apologies -- I already attempted to close this as a mistake on my part, but apparently, that failed too. )-: Sorry.

--
resolution: -> not a bug
status: open -> closed
___
Python tracker <http://bugs.python.org/issue22929>
___
[issue17254] add thai encoding aliases to encodings.aliases
Changes by era :

--
nosy: +era
___
Python tracker <http://bugs.python.org/issue17254>
___
[issue24430] ZipFile.read() cannot decrypt multiple members from Windows 7zfm
New submission from era:

The attached archive from the Windows version of the 7z file manager (7zFM version 9.20) cannot be decrypted into memory. The first file succeeds, but the second one fails.

The following small program is able to unzip other encrypted zip archives (tried one created by Linux 7z version 9.04 on Debian from the package p7zip-full, and one from plain zip 3.0-3 which comes from the InfoZip distribution, as well as a number of archives of unknown provenance) but fails on the attached one.

    from zipfile import ZipFile
    from sys import argv

    container = ZipFile(argv[1])
    for member in container.namelist():
        print("member %s" % member)
        try:
            extracted = container.read(member)
            print("extracted %s" % repr(extracted)[0:64])
        except RuntimeError, err:
            extracted = container.read(member, 'hello')
            container.setpassword('hello')
            print("extracted with password 'hello': %s" % repr(extracted)[0:64])

Here is the output and backtrace:

    member hello/
    extracted ''
    member hello/goodbye.txt
    Traceback (most recent call last):
      File "./nst.py", line 13, in <module>
        extracted = container.read(member, 'hello')
      File "/usr/lib/python2.6/zipfile.py", line 834, in read
        return self.open(name, "r", pwd).read()
      File "/usr/lib/python2.6/zipfile.py", line 901, in open
        raise RuntimeError("Bad password for file", name)
    RuntimeError: ('Bad password for file', 'hello/goodbye.txt')

The 7z command is able to extract it just fine:

    $ 7z -phello x /tmp/hello.zip

    7-Zip 9.04 beta Copyright (c) 1999-2009 Igor Pavlov 2009-05-30
    p7zip Version 9.04 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,1 CPU)

    Processing archive: /tmp/hello.zip

    Extracting hello
    Extracting hello/goodbye.txt
    Extracting hello/hello.txt

    Everything is Ok

    Folders: 1
    Files: 2
    Size: 15
    Compressed: 560

--
files: hello.zip
messages: 245165
nosy: era
priority: normal
severity: normal
status: open
title: ZipFile.read() cannot decrypt multiple members from Windows 7zfm
Added file: http://bugs.python.org/file39680/hello.zip
___
Python tracker <http://bugs.python.org/issue24430>
___
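For readers on Python 3, the reproduction script above might look roughly like the sketch below. It is an illustration only and neither reproduces nor fixes the reported failure: `read_members` is a hypothetical helper, and because the standard library cannot create encrypted archives, the usage example builds an unencrypted archive in memory with the same member names, for which the password is simply ignored.

```python
import io
from zipfile import ZipFile


def read_members(archive, password=None):
    """Return {member name: bytes} for every non-directory member."""
    contents = {}
    with ZipFile(archive) as container:
        if password is not None:
            # Python 3 expects the password as bytes; set it before reading.
            container.setpassword(password)
        for info in container.infolist():
            if info.is_dir():
                continue
            contents[info.filename] = container.read(info)
    return contents


# Stand-in archive with the member names from the report.
buffer = io.BytesIO()
with ZipFile(buffer, 'w') as out:
    out.writestr('hello/', b'')
    out.writestr('hello/goodbye.txt', b'goodbye\n')
    out.writestr('hello/hello.txt', b'hello\n')
buffer.seek(0)

print(sorted(read_members(buffer, password=b'hello')))
```

Setting the password once up front avoids relying on an exception to discover that a member is encrypted, which is the pattern the original script used.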
[issue24430] ZipFile.read() cannot decrypt multiple members from Windows 7zFM
Changes by era :

--
components: +Library (Lib)
title: ZipFile.read() cannot decrypt multiple members from Windows 7zfm -> ZipFile.read() cannot decrypt multiple members from Windows 7zFM
type: -> behavior
___
Python tracker <http://bugs.python.org/issue24430>
___
[issue24430] ZipFile.read() cannot decrypt multiple members from Windows 7zFM
era added the comment:

The call to .setpassword() doesn't seem to make any difference. I was hoping it would offer a workaround, but it didn't.

--
___
Python tracker <http://bugs.python.org/issue24430>
___
[issue28122] email.header.decode_header can not decode string with quotation
era added the comment:

The double quotes around the "human readable" part of the email address are not allowed. Python is handling this correctly.

--
nosy: +era
___
Python tracker <http://bugs.python.org/issue28122>
___
[issue28577] ipaddress.ip_network(...).hosts() returns nothing for an IPv4 /32
New submission from era:

I would expect the following code to return ['10.9.8.8'] but it returns an empty list.

    yosemite-osx$ python3
    Python 3.5.1 (default, Dec 26 2015, 18:08:53)
    [GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import ipaddress
    >>> list(ipaddress.ip_network('10.9.8.7/32').hosts())
    []

This seems to happen for every /32 address. I'm guessing the logic which wants to exclude the gateway and broadcast addresses from a block should treat a /32 as a special case.

I tried to look for a previous bug submission but I could not find one. As such, it seems peculiar if this has not been reported before. Is this actually expected behavior by some rule I am overlooking?

I tested on Linux 3.4 and OSX Yosemite Homebrew / Python 3.5.1.

--
components: Library (Lib)
messages: 279855
nosy: era
priority: normal
severity: normal
status: open
title: ipaddress.ip_network(...).hosts() returns nothing for an IPv4 /32
versions: Python 3.4, Python 3.5
___
Python tracker <http://bugs.python.org/issue28577>
___
[issue28577] ipaddress.ip_network(...).hosts() returns nothing for an IPv4 /32
era added the comment:

(Meh, silly typo, of course the expected output is ['10.9.8.7'], sorry about that!)

--
___
Python tracker <http://bugs.python.org/issue28577>
___
[issue28577] ipaddress.ip_network(...).hosts() returns nothing for an IPv4 /32
era added the comment:

@xiang.zhang thanks for the quick reply.

I find this behavior surprising. If I process a list of addresses, like

    import ipaddress

    ips = (
        '10.9.8.7/32',
        '10.11.12.8/28',
    )

    for test in ['10.9.8.7', '10.11.12.10']:
        # strict=False because 10.11.12.8/28 has host bits set
        if test in [str(y) for x in ips
                    for y in ipaddress.ip_network(x, strict=False).hosts()]:
            print('{0} found'.format(test))
        else:
            print('{0} not found'.format(test))

I would expect both addresses to print "found", but that's not how the current implementation works. I agree that the /28 should not include the gateway and broadcast addresses, but I would not expect the explicitly listed /32 address to completely disappear from the output. Are my expectations incorrect? For code like this, what should I use instead, if not hosts()?

--
___
Python tracker <http://bugs.python.org/issue28577>
___
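One possible answer to the question in the comment above is to test containment directly instead of enumerating hosts. This is a minimal sketch, not a recommendation from the tracker discussion: `is_listed` is a hypothetical name, and containment also accepts the network and broadcast addresses of the /28, which the author wanted to exclude.

```python
import ipaddress

ips = ('10.9.8.7/32', '10.11.12.8/28')

# strict=False tolerates host bits in the written network address.
networks = [ipaddress.ip_network(x, strict=False) for x in ips]


def is_listed(address):
    """True if the address falls inside any configured network."""
    addr = ipaddress.ip_address(address)
    return any(addr in net for net in networks)


for test in ['10.9.8.7', '10.11.12.10']:
    print('{0} {1}'.format(test, 'found' if is_listed(test) else 'not found'))
```

Containment also avoids building a list of every host in each block, which matters for large networks.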
[issue28577] ipaddress.ip_network(...).hosts() returns nothing for an IPv4 /32
era added the comment:

Quick googling did not turn up anything like a credible authoritative reference for this, but in actual practice, I have seen /32 used to designate a single individual IP address in CIDR notation quite a lot.

I can see roughly three options:

1. Status quo. Silently surprise users who expect this to work.

2. Silently fix. Hard-code /32 to return a range of one IP address.

3. Let users choose. Similarly to the "strict=True" keyword argument in the constructor method, the code could allow for either lenient or strict semantics.

By the by, I don't see how the bug you linked to is relevant here, did you mistype the bug number? #27863 is about _elementtree

--
___
Python tracker <http://bugs.python.org/issue28577>
___
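Option 3 above could be approximated in user code with a thin wrapper. This is a sketch under stated assumptions: the function name `hosts` and the `lenient` keyword are hypothetical and not part of the `ipaddress` API, and the result of the non-lenient path for a single-address network may differ between Python versions.

```python
import ipaddress


def hosts(network, lenient=False):
    """List usable hosts; with lenient=True a single-address
    network (an IPv4 /32, for example) yields that one address."""
    net = ipaddress.ip_network(network, strict=False)
    if lenient and net.num_addresses == 1:
        return [net.network_address]
    return list(net.hosts())


print(hosts('10.9.8.7/32', lenient=True))
print(len(hosts('10.11.12.0/28')))
```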
[issue27683] ipaddress subnet slicing iterator malfunction
era added the comment:

#28577 requests a similar special case for /32

--
nosy: +era
___
Python tracker <http://bugs.python.org/issue27683>
___