[issue41928] ZipFile does not supports Unicode Path Extra Field (0x7075) zip header field

2020-10-04 Thread Ivan Sorokin
Ivan Sorokin added the comment: Grand unified algorithm to read filenames from zip files correctly: 1. Does the zip entry have a «Unicode Path Extra Field» (0x7075)? Use it for the file name. 2. Is the Unicode flag (0x800) set in the «Flags» field of the zip entry? Assume the «Filename» field is in UTF-8. 3. Do «HostOS
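Steps 1 and 2 of the algorithm above can be sketched roughly as follows. This is only an illustration, not the patch proposed in the issue: the function names and the `fallback` parameter are hypothetical, the layout of the 0x7075 field (version byte, CRC-32 of the original name, UTF-8 name) follows the Info-ZIP extra field description, and step 3 is not covered because the message is cut off at that point.

```python
import struct
import zlib

UNICODE_PATH_ID = 0x7075  # Info-ZIP Unicode Path Extra Field
UTF8_FLAG = 0x800         # general purpose bit 11: name is UTF-8


def unicode_path_from_extra(extra, raw_name):
    """Return the UTF-8 name stored in a 0x7075 extra field, or None."""
    pos = 0
    while pos + 4 <= len(extra):
        # Each extra field is: header ID (2 bytes), data size (2 bytes), data.
        header_id, size = struct.unpack_from("<HH", extra, pos)
        data = extra[pos + 4 : pos + 4 + size]
        pos += 4 + size
        if header_id != UNICODE_PATH_ID or len(data) < 5:
            continue
        # Data is: version (1 byte), CRC-32 of the raw name (4 bytes), name.
        version, name_crc = struct.unpack_from("<BL", data, 0)
        if version != 1 or name_crc != zlib.crc32(raw_name):
            continue  # stale or unknown field: ignore it
        try:
            return data[5:].decode("utf-8")
        except UnicodeDecodeError:
            continue
    return None


def decode_filename(raw_name, flag_bits, extra, fallback="cp437"):
    """Hypothetical helper: apply step 1, then step 2, then a fallback."""
    name = unicode_path_from_extra(extra, raw_name)
    if name is not None:
        return name                       # step 1: extra field wins
    if flag_bits & UTF8_FLAG:
        return raw_name.decode("utf-8")   # step 2: Unicode flag is set
    return raw_name.decode(fallback)      # assumption: caller picks the codec
```

The CRC check is there because the extra field is only valid for the exact raw name it was written for; if another tool renamed the entry without updating the field, the sketch falls through to the next step.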

[issue41929] Detect OEM code page for zip archives in ZipFile based on system locale

2020-10-04 Thread Ivan Sorokin
New submission from Ivan Sorokin: ZipFile mishandles filenames in .zip archives whose filenames are encoded in an OEM code page. ZipFile assumes that the OEM code page is always "cp437". Actually many popular .zip packers (for example, Windows internal "zip fol
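As a rough illustration of the problem described above, a name that ZipFile has already decoded as cp437 can be turned back into bytes and decoded again with a different OEM code page. The helper names and the small language-to-codec table below are hypothetical and far from complete; this is a workaround sketch under those assumptions, not the detection logic proposed in the issue.

```python
# Hypothetical, deliberately tiny table: language code -> OEM codec name.
OEM_CODEC_BY_LANGUAGE = {
    "ru": "cp866",  # Russian
    "el": "cp737",  # Greek
    "de": "cp850",  # German (Western European OEM page)
}


def oem_codec_for_language(locale_name, default="cp437"):
    """Pick an OEM codec from a locale name such as 'ru_RU'."""
    language = locale_name.split("_")[0].lower()
    return OEM_CODEC_BY_LANGUAGE.get(language, default)


def redecode_name(name, oem_codec):
    """Undo the cp437 decoding and decode the bytes with oem_codec.

    Only meaningful for entries without the UTF-8 flag (0x800),
    since those are the ones decoded as cp437 in the first place.
    """
    return name.encode("cp437").decode(oem_codec)
```

A caller would typically apply `redecode_name` to each `ZipInfo.filename` whose `flag_bits & 0x800` is zero, passing the codec chosen for the locale the archive is believed to come from.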

[issue41928] ZipFile does not supports Unicode Path Extra Field (0x7075) zip header field

2020-10-04 Thread Ivan Sorokin
New submission from Ivan Sorokin: See attached sample. The well-known unzip command-line tool lists its contents correctly:

    $ unzip -l 23.zip
    Archive:  23.zip
      Length      Date    Time    Name
    ---------  ---------- -----   ----
        81408  2012-10-23 19:03   Β' ΦΑΣΗ ΠΕ06 ΣΧΟΛΕΙΑ ΕΑΕΠ (ΙΝΤ