[issue41035] zipfile.Path does not work properly with zip archives where paths start with /
New submission from sorrow : I encountered errors when I had to work with ZPI file where path start with "/" -- components: Library (Lib) messages: 371880 nosy: sorrow priority: normal severity: normal status: open title: zipfile.Path does not work properly with zip archives where paths start with / type: behavior versions: Python 3.8 ___ Python tracker <https://bugs.python.org/issue41035> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue41035] zipfile.Path does not work properly with zip archives where paths start with /
sorrow added the comment: >>> import zipfile >>> import io >>> data = io.BytesIO() >>> zf = zipfile.ZipFile(data, 'w') >>> zf.writestr('/a.txt', 'content of a') >>> zf.filename = 'abcde.zip' >>> root = zipfile.Path(zf) >>> list(root.iterdir()) [] >>> root.exists() False -- ___ Python tracker <https://bugs.python.org/issue41035> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue41035] zipfile.Path does not work properly with zip archives where paths start with /
sorrow added the comment: >It seems you may have discovered a use-case that violates that expectation, a >case where `/a.txt` is identical to `a.txt`. The thing is: it's not. >Can you tell me more about your use-case and why zipp.Path/zipfile.Path should >support it? I received a .zip file and zipfile.Path in my code didn't work with it. I did some testing and discovered that id does not work properly with these archives. It cannot list the contents of such archive. >Is this behavior a result of a real-world example Yes, it is. >(please share details about the origin) I can't. First, the origin of this archive is not totally clear to me (and I do not want to investigate). And second, I'm under NDA. I provided minimal example where archive created with zipfile.ZipFile itself reproduces this behaviour. Just prerpend all paths with / an it does not work. -- ___ Python tracker <https://bugs.python.org/issue41035> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue41035] zipfile.Path does not work properly with zip archives where paths start with /
sorrow added the comment: >how common is this use-case I don't know about that. I just know that I have to make my program work with these files. Asking the clients to stop using this (presumably incorrect) format (or the program that makes it) is not an option. >It appears the file your client has sent and the minimal example you've >generated represents an invalid zip file. Well, I didn't know that. But it doesn't change the matter for me. >What I recommend is that you develop a subclass of zipfile.Path that supports >the abnormal format, use that for your work, and publish it (perhaps here, >perhaps as a library) for others with the same problem to use. I guess I'll have to go this way. >It opens up lots of questions, like: I'll share my considerations about how I see it should be implemented. >- should `at` include the leading slash? I think it doesn't matter, because `at` property is not part of `Path` public AP. Ideally, the files with leading slashes and without them should be treated exactly the same (as if they didn't have the leading slash). So, the only thing I'm concerned with is `at` argument of `Path.__init__`. It shouldn't require leading slash in any case. >- should the class support zip files with mixed leading and non-leading >slashes? No, this is definitely an error and shouldn't work. >- at what point does `Path` become aware of the format used? When the class is created? >- are there emergent performance concerns? For me - no, there aren't. I don't know how this kind of questions is resolved in python community though. >Can we close this as invalid? I guess you can. -- ___ Python tracker <https://bugs.python.org/issue41035> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue41035] zipfile.Path does not work properly with zip archives where paths start with /
sorrow added the comment: >`Path` public AP. API of course -- ___ Python tracker <https://bugs.python.org/issue41035> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue41035] zipfile.Path does not work properly with zip archives where paths start with /
sorrow added the comment: >When the class is created? I mean the class instance -- ___ Python tracker <https://bugs.python.org/issue41035> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue41035] zipfile.Path does not work properly with zip archives where paths start with /
sorrow added the comment: Here's what I came up with: ```python class ZipPath(zipfile.Path): def __init__(self, root, at=""): super().__init__(root, at) if not at.startswith("/") and self.root.namelist()[0].startswith("/"): self.at = f"/{at}" def __repr__(self): return ( f"{self.__class__.__name__}({self.root.filename!r}, " f"{self.at.lstrip('/')!r})" ) def __str__(self): return posixpath.join(self.root.filename, self.at.lstrip("/")) def _is_child(self, path): return posixpath.dirname(path.at.strip("/")) == self.at.strip("/") def _next(self, at): return self.__class__(self.root, at) ``` Pretty simple. The main things are going on in `__init__` and `_is_child` methods. These changes are enough for `iteritems()` to work. I decided to include the leading slash in the `at`, but strip it for the outside world (__str__ and __repr__). Also, I had to override the `_next` method because it makes `Path` objects no matter what class it's called from. -- ___ Python tracker <https://bugs.python.org/issue41035> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue41035] zipfile.Path does not work properly with zip archives where paths start with /
sorrow added the comment: >`iteritems()` I meant `iterdir()` of course. -- ___ Python tracker <https://bugs.python.org/issue41035> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com