[issue41102] ZipFile.namelist() does not match the actual files in .zip file
Xiaolong Liu added the comment:

Dear Andrei,

It's about 63MB. I reshared it on Google Drive. Please check the following link.
https://drive.google.com/file/d/1534MdIcGbXtMwYfuo2zeFxm6BVgHa4XX/view?usp=sharing

Bruce / Xiaolong Liu / 刘小龙

From: Andrei Kulakov
Date: 2021-08-14 20:49
To: liuxiaolong125
Subject: [issue41102] ZipFile.namelist() does not match the actual files in .zip file

Andrei Kulakov added the comment:

Xiaolong: The file no longer exists on the google drive. How big was the file?

--
nosy: +andrei.avk

--
___
Python tracker <https://bugs.python.org/issue41102>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue41102] ZipFile.namelist() does not match the actual files in .zip file
Xiaolong Liu added the comment:

Andrei: Exactly, different extraction tools gave different results.

Files extracted  Tool                        Platform
674              the built-in tool on Win10  Win10
674              WinRAR                      Win10
1399             7zip                        Win10
1399             360zip                      Win10
674              unzip                       Ubuntu 20.10
1399             7zip                        Ubuntu 20.10
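One way to investigate a split like the one in the table is to compare the names in the central directory (which is what ZipFile.namelist() reports) with the names found by scanning the raw bytes for local file headers. The sketch below is only a diagnostic idea, not something anyone in this thread ran, and the cause of the mismatch was never established. The function names are hypothetical, and the scan is naive: compressed data can contain the signature bytes by chance, so extra names should be treated as hints.

```python
import struct
import zipfile

# Local file header signature defined by the ZIP format.
LOCAL_HEADER_SIG = b"PK\x03\x04"


def central_directory_names(path):
    """Names listed in the central directory that zipfile reads."""
    with zipfile.ZipFile(path) as archive:
        return set(archive.namelist())


def local_header_names(path):
    """Names found by a naive scan of the file for local file headers."""
    with open(path, "rb") as fh:
        data = fh.read()
    names = set()
    pos = data.find(LOCAL_HEADER_SIG)
    while pos != -1:
        header = data[pos:pos + 30]  # fixed part of a local header is 30 bytes
        if len(header) == 30:
            flags = struct.unpack("<H", header[6:8])[0]
            name_len = struct.unpack("<H", header[26:28])[0]
            raw_name = data[pos + 30:pos + 30 + name_len]
            # Bit 11 of the flags marks a UTF-8 encoded file name.
            encoding = "utf-8" if flags & 0x800 else "cp437"
            names.add(raw_name.decode(encoding, errors="replace"))
        pos = data.find(LOCAL_HEADER_SIG, pos + 4)
    return names
```

Computing local_header_names(path) - central_directory_names(path) for the archive in question would list entries that are physically present in the file but not reported by namelist().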
[issue41102] ZipFile.namelist() does not match the actual files in .zip file
Xiaolong Liu added the comment: Andrei: The zipped file was created by the zipfile module of Python. That's the reason I posted it here. I archived more than 2000 files into the abnormal zip file, and no tool can extract all of the files archived within it. 7zip got the first part, and the other tools got the rest.
[issue41102] ZipFile.namelist() does not match the actual files in .zip file
Xiaolong Liu added the comment:

Serhiy: Thanks for your explanation. The file was created by the zipfile module. I used the script hundreds of times, and only one run (the uploaded zip file) was abnormal. Since the project ended a long time ago, I cannot reproduce the error right now. I will post the related code snippet I used.

    import zipfile
    import pathlib

    def fileIsValid(filename):
        filename = pathlib.Path(filename)
        return True if filename.is_file() and filename.stat().st_size > 0 else False

    def compress2zip(sourceFile,zipFile,destinationFile):
        if not fileIsValid(zipFile):
            pathlib.Path(zipFile).parent.mkdir(parents=True, exist_ok=True)
            with zipfile.ZipFile(zipFile,'w',compression=zipfile.ZIP_DEFLATED) as myzip:
                myzip.write(sourceFile,destinationFile)
                dest_size = myzip.getinfo(destinationFile).file_size
        else:
            with zipfile.ZipFile(zipFile,'a',compression=zipfile.ZIP_DEFLATED) as myzip:
                if not destinationFile in myzip.namelist():
                    myzip.write(sourceFile,destinationFile)
                dest_size = myzip.getinfo(destinationFile).file_size
        source_size = pathlib.Path(sourceFile).stat().st_size
        if source_size == dest_size:
            print('Succeeded -- compress -- %s' % str(destinationFile))
            return True
        else:
            print('Failed -- compress -- %s' %str(destinationFile))
            return False

    files = list(pathlib.Path('000-original').glob('*.geojson'))
    zipFile = pathlib.Path('.zip')
    for file in files:
        comp_re = compress2zip(file, zipFile, file.name)
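The snippet above reopens the archive in append mode once per input file and calls namelist() each time. As a hedged alternative, and not a confirmed fix since the cause of the abnormal archive was never identified, the archive could be opened once and the known names tracked in a set. The function name compress_all and its return value are hypothetical choices for this sketch.

```python
import pathlib
import zipfile


def compress_all(source_files, zip_path):
    """Add each source file to zip_path once.

    Returns the names whose stored size differs from the size on disk.
    """
    zip_path = pathlib.Path(zip_path)
    zip_path.parent.mkdir(parents=True, exist_ok=True)
    mismatched = []
    # Mode 'a' creates the archive if it does not exist, otherwise appends.
    with zipfile.ZipFile(zip_path, "a", compression=zipfile.ZIP_DEFLATED) as archive:
        existing = set(archive.namelist())
        for source in map(pathlib.Path, source_files):
            if source.name not in existing:
                archive.write(source, source.name)
                existing.add(source.name)
            # Same check as the original snippet: uncompressed size vs. size on disk.
            if archive.getinfo(source.name).file_size != source.stat().st_size:
                mismatched.append(source.name)
    return mismatched
```

With this shape the central directory is written once, when the with block exits, rather than being rewritten after every file.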
[issue41102] ZipFile.namelist() does not match the actual files in .zip file
Xiaolong Liu added the comment: It seems the indents were automatically removed in the message box. I shared the formatted code snippet here: https://www.online-python.com/cDojvl2CMS
[issue41102] ZipFile.namelist() does not match the actual files in .zip file
Xiaolong Liu added the comment: Andrei: Never mind. Yes, it is nearly impossible to sort out a problem when it cannot be reproduced. Please just close it. Anyway, many thanks for your suggestions.
[issue41102] ZipFile.namelist() does not match the actual files in .zip file
Xiaolong Liu added the comment: Andrei: No multiprocessing or multithreading was used when creating the zip file.
[issue41102] ZipFile.namelist() does not match the actual files in .zip file
New submission from Xiaolong Liu:

I used the zipfile module to archive thousands of .geojson files into zip files and accessed those .geojson files with the ZipFile.open() method. Across hundreds of runs, one was abnormal. As the title says, ZipFile.namelist() did not match all the files in the .zip file. When I extracted it with the extractall() method, I got only the files included in the namelist. When I extracted it with my compression software (360zip), I got the other files, which are not included in namelist(). Only one file (2654.geojson) appeared with both extraction methods. ZipFile.extractall() got 674 files, from '2654.geojson' to '3989.geojson'. 360zip got 1399 files, from '.geojson' to '2654.geojson'. The abnormal file is too big to upload to this page, so I uploaded it to Google Drive: https://drive.google.com/file/d/1UE2N2qwjn4m7uE6YF2A1FhdXYHP_7zQr/view?usp=sharing

--
components: Library (Lib)
messages: 372247
nosy: alanmcintyre, longavailable, serhiy.storchaka, twouters
priority: normal
severity: normal
status: open
title: ZipFile.namelist() does not match the actual files in .zip file
type: performance
versions: Python 3.8