[issue44524] __name__ attribute in typing module
New submission from Lars:

I noticed some (perhaps intentional) oddities with the __name__ attribute:

- typing objects like Any (an instance of _SpecialForm) do not have a __name__ attribute,
- abstract base classes in typing, like MutableSet, do not have a __name__ attribute,
- 'ChainMap', 'Counter', 'OrderedDict' do not have a __name__ attribute when imported from typing, but do when imported from collections.

I have written a function to show presence/absence of the __name__ attribute:

    def split_module_names(module):
        unnamed, named = set(), set()
        for name in dir(module):
            if not name.startswith('_'):
                attr = getattr(module, name)
                try:
                    if hasattr(attr, '__name__'):
                        named.add(name)
                    else:
                        unnamed.add(name)
                except TypeError:
                    pass
        return named, unnamed

    import typing
    import collections

    typing_named, typing_unnamed = split_module_names(typing)
    collec_named, collec_unnamed = split_module_names(collections)

    print("typing_unnamed:", typing_unnamed)
    print("collec_named & typing_unnamed:", collec_named & typing_unnamed)

Is this intentional? It seems a little inconsistent.

I also found that the __name__ attribute sometimes does resolve:

    class S(typing.Sized):
        def __len__(self):
            return 0

    print("'Sized' in typing_unnamed:", 'Sized' in typing_unnamed)
    print("[t.__name__ for t in S.__mro__]:", [t.__name__ for t in S.__mro__])  # here __name__ is resolved!
    print("getattr(typing.Sized, '__name__', None):", getattr(typing.Sized, '__name__', None))

printing:

    'Sized' in typing_unnamed: True
    [t.__name__ for t in S.__mro__]: ['S', 'Sized', 'Generic', 'object']
    getattr(typing.Sized, '__name__', None): None

-- messages: 396638 nosy: farcat priority: normal severity: normal status: open title: __name__ attribute in typing module type: behavior versions: Python 3.9 ___ Python tracker <https://bugs.python.org/issue44524> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44524] __name__ attribute in typing module
Lars added the comment:

I have been doing some research, but note that I don't have much experience with the typing module. That said, there seem to be 2 main cases:

- '_SpecialForm': with instances Any, Union, etc.
- '_BaseGenericAlias'/'_SpecialGenericAlias': base classes for the collections classes.

I think '_SpecialForm' can be enhanced to have '__name__' by replacing the '_name' attribute with '__name__'. Maybe add '__qualname__' as well. I cannot say whether there are many more attributes that could be implemented to have the same meaning as in 'type'. The meaning of attributes like '__mro__' seems difficult to define. Alternatively '__getattr__' could be added (but that might be too much):

    def __getattr__(self, attr):
        return getattr(self._getitem, attr)

For '_BaseGenericAlias'/'_SpecialGenericAlias' the '__getattr__' method could perhaps be adapted (or overridden in '_SpecialGenericAlias') as follows, from:

    def __getattr__(self, attr):
        # We are careful for copy and pickle.
        # Also for simplicity we just don't relay all dunder names
        if '__origin__' in self.__dict__ and not _is_dunder(attr):
            return getattr(self.__origin__, attr)
        raise AttributeError(attr)

to:

    def __getattr__(self, attr):
        if '__origin__' in self.__dict__:
            return getattr(self.__origin__, attr)
        raise AttributeError(attr)

or perhaps:

    def __getattr__(self, attr):
        if '__origin__' in self.__dict__ and hasattr(type, attr):
            return getattr(self.__origin__, attr)
        raise AttributeError(attr)

to forward unresolved attribute names to the original class.

I have written some tools and tested some with the above solutions and this seems to solve the missing '__name__' issue and make the typing ABCs much more in line with the collections ABCs. However, I did not do any unit/regression testing (pull the repo, etc.). Tools are attached.

-- Added file: https://bugs.python.org/file50135/typing_attributes.py ___ Python tracker <https://bugs.python.org/issue44524> ___
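The forwarding idea in the last variant can be tried outside typing with a toy stand-in. This is only a sketch: ForwardingAlias is a hypothetical class, not part of typing, and it merely mimics how an alias object keeps the wrapped class in '__origin__'.

```python
import collections


class ForwardingAlias:
    """Hypothetical stand-in for a typing alias; not the real typing class."""

    def __init__(self, origin):
        self.__origin__ = origin

    def __getattr__(self, attr):
        # Only called when normal lookup fails. Forward names that exist on
        # `type` (such as __name__) to the wrapped class, nothing else.
        if '__origin__' in self.__dict__ and hasattr(type, attr):
            return getattr(self.__origin__, attr)
        raise AttributeError(attr)


alias = ForwardingAlias(collections.OrderedDict)
print(alias.__name__)          # resolved through the wrapped class
print(hasattr(alias, 'keys'))  # 'keys' is not an attribute of `type`
```

With the hasattr(type, attr) guard, only names that every class has are forwarded, so ordinary methods of the wrapped class stay hidden.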
[issue44524] __name__ attribute in typing module
Lars added the comment: Happy to see progress on this issue and I can see that adding these attributes to the ABC's in typing makes the most sense. However for my direct use-case (simplified: using Any in a type checking descriptor) it would be really practical to have the __name__ (and perhaps __qualname__ and __module__) attributes in the Any type. This is mainly for consistent logging/printing purposes. Since Any already has a _name attribute, changing this to __name__ might achieve this. -- ___ Python tracker <https://bugs.python.org/issue44524> ___
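For the logging use-case, a helper that does not depend on which typing objects carry '__name__' in a given Python version might look like this. It is a sketch under assumptions: display_name is a made-up function, and '_name' is the private attribute mentioned above, which may change between releases.

```python
import typing


def display_name(tp):
    """Return a printable name for a class or typing construct."""
    # Prefer the public attribute, then the private '_name' mentioned above,
    # and finally fall back to repr().
    for attr in ('__name__', '_name'):
        name = getattr(tp, attr, None)
        if isinstance(name, str):
            return name
    return repr(tp)


for tp in (int, typing.Any, typing.List[int]):
    print(display_name(tp))
```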
[issue40560] uuid.uuid4().hex not random
New submission from Lars:

Hi everybody,

I just noticed that the uuid.uuid4().hex command does not create fully random hex values. The character at the 13th position is always 4, and the 17th position is equally distributed over 8, 9, a, b. One million UUIDs give the following distribution (one dictionary per character position):

{'0': 62312, '1': 62626, '2': 62308, '3': 62801, '4': 62173, '5': 62622, '6': 62527, '7': 62310, '8': 62574, '9': 62314, 'a': 62575, 'b': 62472, 'c': 62500, 'd': 62229, 'e': 62813, 'f': 62844}
{'0': 62239, '1': 63002, '2': 62551, '3': 62601, '4': 62075, '5': 62314, '6': 62584, '7': 62184, '8': 62359, '9': 62637, 'a': 63100, 'b': 62782, 'c': 62097, 'd': 62359, 'e': 62487, 'f': 62629}
{'0': 62647, '1': 62649, '2': 62924, '3': 62555, '4': 62683, '5': 62435, '6': 62403, '7': 63010, '8': 62235, '9': 62412, 'a': 62320, 'b': 62081, 'c': 62795, 'd': 62329, 'e': 62420, 'f': 62102}
{'0': 62641, '1': 62772, '2': 62458, '3': 62483, '4': 62201, '5': 62564, '6': 62307, '7': 62822, '8': 62102, '9': 62284, 'a': 62561, 'b': 62749, 'c': 62264, 'd': 62732, 'e': 62445, 'f': 62615}
{'0': 62433, '1': 62815, '2': 62761, '3': 62355, '4': 62526, '5': 62464, '6': 62494, '7': 62116, '8': 62813, '9': 62556, 'a': 62722, 'b': 62440, 'c': 62634, 'd': 61967, 'e': 62425, 'f': 62479}
{'0': 62544, '1': 62573, '2': 62774, '3': 62143, '4': 62814, '5': 62144, '6': 62207, '7': 62605, '8': 62567, '9': 62689, 'a': 62500, 'b': 62631, 'c': 62460, 'd': 62156, 'e': 62613, 'f': 62580}
{'0': 62707, '1': 62315, '2': 62698, '3': 62260, '4': 62634, '5': 62145, '6': 62358, '7': 62725, '8': 61971, '9': 62559, 'a': 62341, 'b': 62846, 'c': 62650, 'd': 62527, 'e': 62712, 'f': 62552}
{'0': 62477, '1': 62571, '2': 62672, '3': 62207, '4': 62798, '5': 62338, '6': 62381, '7': 62490, '8': 62478, '9': 62434, 'a': 62391, 'b': 62397, 'c': 62870, 'd': 62550, 'e': 62679, 'f': 62267}
{'0': 62238, '1': 62361, '2': 62895, '3': 62525, '4': 62799, '5': 62763, '6': 62422, '7': 62621, '8': 62446, '9': 62160, 'a': 62636, 'b': 62601, 'c': 62331, 'd': 62342, 'e': 62156, 'f': 62704}
{'0': 62668, '1': 62824, '2': 61820, '3': 62839, '4': 62107, '5': 62527, '6': 62497, '7': 62287, '8': 62881, '9': 62455, 'a': 62742, 'b': 62590, 'c': 62278, 'd': 62419, 'e': 62550, 'f': 62516}
{'0': 62889, '1': 62561, '2': 62428, '3': 62696, '4': 63173, '5': 62220, '6': 62831, '7': 62762, '8': 62267, '9': 62065, 'a': 62737, 'b': 62064, 'c': 62520, 'd': 62593, 'e': 61960, 'f': 62234}
{'0': 62441, '1': 62602, '2': 62799, '3': 62707, '4': 62200, '5': 62562, '6': 62359, '7': 62760, '8': 62530, '9': 62726, 'a': 62210, 'b': 62299, 'c': 62068, 'd': 62702, 'e': 62551, 'f': 62484}
{'0': 0, '1': 0, '2': 0, '3': 0, '4': 100, '5': 0, '6': 0, '7': 0, '8': 0, '9': 0, 'a': 0, 'b': 0, 'c': 0, 'd': 0, 'e': 0, 'f': 0}
{'0': 62660, '1': 62302, '2': 62128, '3': 62184, '4': 62560, '5': 62455, '6': 628
[issue40560] uuid.uuid4().hex not random
Lars added the comment: OK, that makes sense. Thanks for letting me know. I should have read the documentation more carefully. -- resolution: -> not a bug stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue40560> ___
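The two fixed positions are what RFC 4122 prescribes for a version 4 UUID: the 13th hex digit carries the version number and the top two bits of the 17th digit carry the variant, which leaves 122 random bits. A quick check, as a sketch (the helper name digit_histogram is made up for this illustration):

```python
import uuid
from collections import Counter


def digit_histogram(position, samples=2000):
    """Count which hex digits occur at a 0-based position of uuid4().hex."""
    return Counter(uuid.uuid4().hex[position] for _ in range(samples))


version_digits = digit_histogram(12)   # 13th character: version nibble
variant_digits = digit_histogram(16)   # 17th character: variant bits + 2 random bits

print(sorted(version_digits))
print(sorted(variant_digits))

u = uuid.uuid4()
print(u.version, u.variant == uuid.RFC_4122)
```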
[issue672115] Assignment to __bases__ of direct object subclasses
Lars added the comment:

In my project I need to be able to let the user dynamically make and remove inheritance relationships between classes, and in my testing I think I have run into this issue when assigning to __bases__. The

    class object(object): pass

trick seems to work, but I can't really foresee the consequences. I also saw another variation which might be the same issue:

    A = type("A", (object,), {'one': 1})
    B = type("B", (object,), {'two': 2})
    C = type("C", (object,), {'three': 3})
    A = type("A", (A, B), {})
    print dir(A)
    print A.__bases__
    print '---'
    A.__bases__ = (B, C)
    print dir(A)
    print A.__bases__
    print '---'

No exceptions, but the second dir(A) shows that A has lost its attribute 'one'.

If the class object(object) trick is not safe, is there a way to get the dynamic inheritance behaviour in another way, e.g. through metaclasses?

-- nosy: +farcat ___ Python tracker <http://bugs.python.org/issue672115> ___
[issue672115] Assignment to __bases__ of direct object subclasses
Lars added the comment: OK, I see what you mean. For me, at this time, the most important question is: what does "class object(object): pass" do, why can I change base classes after I redeclare object this way, and will it get me into trouble when I use this to let users dynamically define classes and inheritance relationships? Cheers -- ___ Python tracker <http://bugs.python.org/issue672115> ___
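One reading of the snippet in the earlier comment: the attribute 'one' was never defined on the second class named A. It lives on the first A, which the rebinding turned into a base class, so it disappears once that base is dropped from __bases__. The sketch below uses the hypothetical name OldA for the first class to make this visible, and it assumes Python 3 print syntax.

```python
OldA = type("A", (object,), {'one': 1})
B = type("B", (object,), {'two': 2})
C = type("C", (object,), {'three': 3})

# Same as the rebinding `A = type("A", (A, B), {})` in the original snippet.
A = type("A", (OldA, B), {})
print(hasattr(A, 'one'), A.__bases__)

# Replacing the bases removes OldA from the MRO, and 'one' with it.
A.__bases__ = (B, C)
print(hasattr(A, 'one'), hasattr(A, 'two'), hasattr(A, 'three'))
```

Under this reading the loss of 'one' is ordinary attribute lookup along the new MRO rather than a sign of corruption.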
[issue1276378] tarfile: adding filed that use direct device addressing
Lars Gustäbel added the comment: Closing this due to lack of interest. This is not a tarfile issue. If direct device addressing should be supported by Python, os.stat() would be the place to implement it. -- resolution: -> wont fix status: open -> closed _ Tracker <http://bugs.python.org/issue1276378> _
[issue1044] tarfile insecure pathname extraction
New submission from Lars Gustäbel:

tarfile does not check pathnames or linknames on extraction. This can lead to data loss or attack scenarios when members with absolute pathnames or pathnames outside of the archive's scope overwrite or overlay existing files or directories.

Example for a symlink attack against /etc/passwd:

    foo -> /etc
    foo/passwd

-- assignee: lars.gustaebel components: Library (Lib) files: insecure_pathnames.diff keywords: patch messages: 55361 nosy: lars.gustaebel, matejcik priority: normal severity: normal status: open title: tarfile insecure pathname extraction type: security versions: Python 2.6 __ Tracker <http://bugs.python.org/issue1044> __

Index: Doc/library/tarfile.rst
===
--- Doc/library/tarfile.rst (revision 57571)
+++ Doc/library/tarfile.rst (working copy)
@@ -168,6 +168,12 @@
    :attr:`TarFile.errorlevel`\ ``== 2``.
 
 
+.. exception:: SecurityError
+
+   Is a subclass of :exc:`ExtractError` and is raised when insecure pathnames
+   are found on extraction.
+
+
 .. exception:: HeaderError
 
    Is raised by :meth:`frombuf` if the buffer it gets is invalid.
@@ -327,16 +333,22 @@
    available.
 
 
-.. method:: TarFile.extractall([path[, members]])
+.. method:: TarFile.extractall([path[, members[, check_paths]]])
 
    Extract all members from the archive to the current working directory or
-   directory *path*. If optional *members* is given, it must be a subset of the
-   list returned by :meth:`getmembers`. Directory information like owner,
-   modification time and permissions are set after all members have been extracted.
-   This is done to work around two problems: A directory's modification time is
-   reset each time a file is created in it. And, if a directory's permissions do
-   not allow writing, extracting files to it will fail.
+   directory *path*. If optional *members* is given, it must be an iterator of
+   :class:`TarInfo` objects (e.g. a subset of the list returned by
+   :meth:`getmembers`). If *check_paths* is :const:`True` (default), insecure
+   pathnames that are absolute or point to a destination outside of the
+   archive's scope are rejected. Depending on :attr:`TarFile.errorlevel` a
+   :exc:`SecurityError` is raised.
 
+   Directory information like owner, modification time and permissions are set
+   after all members have been extracted. This is done to work around two
+   problems: A directory's modification time is reset each time a file is
+   created in it. And, if a directory's permissions do not allow writing,
+   extracting files to it will fail.
+
    .. versionadded:: 2.5
 
 
@@ -349,9 +361,8 @@
 
    .. note::
 
-      Because the :meth:`extract` method allows random access to a tar archive there
-      are some issues you must take care of yourself. See the description for
-      :meth:`extractall` above.
+      The :meth:`extract` method does not take care of several extraction issues.
+      In most cases you should consider using the :meth:`extractall` method.
 
 
 .. method:: TarFile.extractfile(member)
Index: Lib/tarfile.py
===
--- Lib/tarfile.py (revision 57571)
+++ Lib/tarfile.py (working copy)
@@ -340,6 +340,9 @@
 class ExtractError(TarError):
     """General exception for extract errors."""
     pass
+class SecurityError(ExtractError):
+    """Exception for insecure pathnames."""
+    pass
 class ReadError(TarError):
     """Exception for unreadble tar archives."""
     pass
@@ -2006,12 +2009,13 @@
 
         self.members.append(tarinfo)
 
-    def extractall(self, path=".", members=None):
+    def extractall(self, path=".", members=None, check_paths=True):
         """Extract all members from the archive to the current working
            directory and set owner, modification time and permissions on
            directories afterwards. `path' specifies a different directory
            to extract to. `members' is optional and must be a subset of the
-           list returned by getmembers().
+           list returned by getmembers(). If `check_paths' is True insecure
+           pathnames are not extracted.
         """
         directories = []
 
@@ -2019,6 +2023,20 @@
             members = self
 
         for tarinfo in members:
+            if check_paths:
+                try:
+                    self._check_path(tarinfo.name)
+                    if tarinfo.islnk():
+                        self._check_path(tarinfo.linkname)
+                    if tarinfo.issym():
+                        self._check_path(os.path.join(tarinfo.name, tarinfo.linkname))
[issue1044] tarfile insecure pathname extraction
Lars Gustäbel added the comment: In principle I do not object, but this is a preliminary patch. I am still not happy with the naming of the "check_paths" argument. Also, the patch was made against the trunk which means that it contains hunks with the new reStructuredText documentation. Please be patient. I do not change extract() because it has become more and more a low-level method over the years, that makes promises it cannot keep and should not be used at all. I try to discourage its use in the documentation. __ Tracker <http://bugs.python.org/issue1044> __
[issue1059] *args and **kwargs in function definitions
New submission from Lars Gustäbel: For example in tarfile.rst and logging.rst there are function definitions using *args and/or **kwargs like: .. function:: debug(msg[, *args[, **kwargs]]) The * and ** should be escaped IMO, so that they are not mistaken as reStructuredText markup, which confuses the syntax coloring of my Vim. While escaping * with a backslash works fine in normal text, it does not work in a function definition and the backslash appears in the HTML output. -- assignee: georg.brandl components: Documentation messages: 55434 nosy: lars.gustaebel priority: low severity: minor status: open title: *args and **kwargs in function definitions versions: Python 2.6 __ Tracker <http://bugs.python.org/issue1059> __
[issue1044] tarfile insecure pathname extraction
Lars Gustäbel added the comment: After careful consideration and a private discussion with Martin I no longer think that we have a security issue here. tarfile.py does nothing wrong; its behaviour conforms to the pax definition and pathname resolution guidelines in POSIX. There is no known or possible practical exploit. I will update the documentation with a warning that it might be dangerous to extract archives from untrusted sources. That is the only thing to be done IMO. -- type: security -> behavior __ Tracker <http://bugs.python.org/issue1044> __
[issue1044] tarfile insecure pathname extraction
Lars Gustäbel added the comment: I updated the documentation, r57764 (trunk) and r57765 (2.5). -- resolution: -> works for me status: open -> closed __ Tracker <http://bugs.python.org/issue1044> __
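For readers who extract archives from untrusted sources anyway, a check along the lines this thread describes could look as follows. This is a minimal sketch and not the _check_path method from the patch, whose body does not appear in this thread; is_inside is a hypothetical helper, and it assumes POSIX-style paths.

```python
import os


def is_inside(dest, member_name):
    """Return True if member_name, joined to dest, stays below dest."""
    dest = os.path.realpath(dest)
    # os.path.join() discards dest when member_name is absolute, and
    # realpath() resolves '..' components and already existing symlinks.
    target = os.path.realpath(os.path.join(dest, member_name))
    return os.path.commonpath([dest, target]) == dest


for name in ("docs/readme.txt", "../outside.txt", "/etc/passwd"):
    print(name, is_inside("unpacked", name))
```

Such a check would have to run for every member name and link target before extraction. Recent Python releases also offer extraction filters through the filter argument of TarFile.extractall().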
[issue1119] Search index is messed up after partial rebuilding
New submission from Lars Gustäbel:

When rebuilding parts of the documentation the search index is emptied. The problem is that the extensions are not stripped from the filenames that are given to the IndexBuilder.prune() method. Therefore, the Search widget on http://docs.python.org/dev/3.0/ produces no results.

-- components: Documentation tools (Sphinx) files: sphinx-index.diff keywords: patch messages: 55689 nosy: lars.gustaebel severity: normal status: open title: Search index is messed up after partial rebuilding __ Tracker <http://bugs.python.org/issue1119> __

Index: builder.py
===
--- builder.py (revision 58006)
+++ builder.py (working copy)
@@ -498,7 +498,7 @@
             except (IOError, OSError):
                 pass
         # delete all entries for files that will be rebuilt
-        self.indexer.prune(set(self.env.all_files) - set(filenames))
+        self.indexer.prune([fn[:-4] for fn in set(self.env.all_files) - set(filenames)])
 
     def index_file(self, filename, doctree, title):
         # only index pages with title
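The patch strips the extension with the slice fn[:-4], which assumes a four-character suffix such as ".rst". A variant that does not depend on the suffix length could use os.path.splitext. The sketch below only illustrates that idea with made-up file names and does not call Sphinx.

```python
import os

# Hypothetical file sets standing in for env.all_files and the rebuilt files.
all_files = {"library/tarfile.rst", "library/gzip.rst", "tutorial/index.rst"}
rebuilt = {"library/tarfile.rst"}

# Names of the files that are NOT rebuilt, without their extension.
unchanged = {os.path.splitext(fn)[0] for fn in all_files - rebuilt}
print(sorted(unchanged))
```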
[issue12800] 'tarfile.StreamError: seeking backwards is not allowed' when extract symlink
Changes by Lars Gustäbel : -- assignee: -> lars.gustaebel ___ Python tracker <http://bugs.python.org/issue12800> ___
[issue5689] Support xz compression in tarfile module
Lars Gustäbel added the comment: Attached is a patch with the current state of my work on lzma integration into tarfile (17 test errors). -- assignee: -> lars.gustaebel keywords: +patch Added file: http://bugs.python.org/file23162/2011-09-15-tarfile-lzma.diff ___ Python tracker <http://bugs.python.org/issue5689> ___
[issue6715] xz compressor support
Lars Gustäbel added the comment:

Today I played around with lzma support for tarfile based on your last patch (see issue5689). There are a few minor issues that I just wanted to mention, as they break the tarfile testsuite:

- LZMAFile does not expose a name attribute. BZ2File doesn't either (not in 3.x anyway), but GzipFile does.
- LZMAFile does not allow a 'b' in the mode argument, unlike GzipFile and BZ2File.
- The bz2 module exposes many error conditions as standard Python exceptions, e.g. IOError, EOFError. The lzma module uses LZMAError for all errors without distinction.

-- ___ Python tracker <http://bugs.python.org/issue6715> ___
[issue13031] [PATCH] small speed-up for tarfile.py when unzipping tarballs
Changes by Lars Gustäbel : -- assignee: -> lars.gustaebel nosy: +lars.gustaebel priority: normal -> low versions: +Python 3.3 -Python 2.7, Python 3.2 ___ Python tracker <http://bugs.python.org/issue13031> ___
[issue13121] collections.Counter's += copies the entire object
New submission from Lars Buitinck:

I've found some counterintuitive behavior in collections.Counter while hacking on the scikit-learn project [1]. I wanted to use a bunch of Counters to do some simple term counting in a set of documents, roughly as follows:

    count_total = Counter()
    for doc in documents:
        count_current = Counter(analyze(doc))
        count_total += count_current
        count_per_doc.append(count_current)

Performance was horrible. After some digging, I found out that Counter [2] does not have __iadd__ and += copies the entire left-hand side in __add__. I've attached a patch that fixes the issue (for += only, and I've not patched the testsuite.)

[1] https://github.com/scikit-learn/scikit-learn/commit/de6e93094499e4d81b8e3b15fc66b6b9252945af

-- components: Library (Lib) files: cpython-counter-iadd.diff keywords: patch messages: 145063 nosy: larsmans priority: normal severity: normal status: open title: collections.Counter's += copies the entire object type: behavior versions: Python 3.4 Added file: http://bugs.python.org/file23336/cpython-counter-iadd.diff ___ Python tracker <http://bugs.python.org/issue13121> ___
[issue13121] collections.Counter's += copies the entire object
Lars Buitinck added the comment: If this is not implemented because it is backwards incompatible, then it might be useful to add a note to update's docstring explaining that it is much more efficient than +=. I was very surprised that it took *minutes* to add a few thousand moderate-sized Counters. -- ___ Python tracker <http://bugs.python.org/issue13121> ___
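The in-place alternative mentioned here, Counter.update, adds counts to the existing object rather than building a new one. A small sketch of the loop from the report rewritten that way follows; the documents list and the whitespace-splitting analyze function are stand-ins invented for the example.

```python
from collections import Counter

documents = ["a b a", "b c", "a c c"]


def analyze(doc):
    """Stand-in tokenizer: split on whitespace."""
    return doc.split()


count_total = Counter()
count_per_doc = []
for doc in documents:
    count_current = Counter(analyze(doc))
    count_total.update(count_current)   # mutates count_total in place
    count_per_doc.append(count_current)

print(count_total)
```

One difference to keep in mind: the + operator drops entries whose resulting count is zero or negative, while update keeps them.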
[issue13158] tarfile.TarFile.getmembers misses some entries
Changes by Lars Gustäbel : -- assignee: -> lars.gustaebel nosy: +lars.gustaebel versions: +Python 3.3 ___ Python tracker <http://bugs.python.org/issue13158> ___
[issue13158] tarfile.TarFile.getmembers misses some entries
Lars Gustäbel added the comment: Thanks for the report. There was a problem decoding a special and rare kind of header field in the archive. The format of the archive is of very bad quality BTW ;-) -- resolution: -> fixed stage: -> committed/rejected status: open -> closed ___ Python tracker <http://bugs.python.org/issue13158> ___
[issue13407] tarfile.getnames misses members again
Lars Gustäbel added the comment: Some testing reveals that the bz2 module < 3.3 cannot fully decompress the file in question. Only the first 900k are decompressed. Thus, this issue is not related to issue13158 or the tarfile module. -- nosy: +lars.gustaebel ___ Python tracker <http://bugs.python.org/issue13407> ___
[issue13477] tarfile module should have a command line
Lars Gustäbel added the comment: This is not a bad idea. I recommend keeping it as simple as possible. I would definitely not be supportive of a full tar clone. List, extract, create: that should be enough. There are two possible command line choices: do what the zipfile module does or emulate tar. I am in favor of the latter. -- assignee: -> lars.gustaebel priority: normal -> low stage: test needed -> needs patch ___ Python tracker <http://bugs.python.org/issue13477> ___
[issue5689] Support xz compression in tarfile module
Lars Gustäbel added the comment: I will be happy to, but my spare time is limited right now, so this could take about a week. If this is a problem, please go ahead. -- ___ Python tracker <http://bugs.python.org/issue5689> ___
[issue5689] Support xz compression in tarfile module
Lars Gustäbel added the comment: For those who want to test it first, I post the current state of the patch here. It is ready for commit, there are no failing tests. If nobody objects, I will apply it this weekend. -- Added file: http://bugs.python.org/file23880/2011-12-08-tarfile-lzma.diff ___ Python tracker <http://bugs.python.org/issue5689> ___
[issue5689] Support xz compression in tarfile module
Lars Gustäbel added the comment: Thanks for the review, guys! I can't close this issue yet because it depends on #6715. -- resolution: -> fixed stage: needs patch -> committed/rejected ___ Python tracker <http://bugs.python.org/issue5689> ___
[issue5689] Support xz compression in tarfile module
Lars Gustäbel added the comment: Please, go ahead! -- ___ Python tracker <http://bugs.python.org/issue5689> ___
[issue11638] python setup.py sdist --formats tar* crashes if version is unicode
Lars Gustäbel added the comment: Is there a good reason why the tarfile mode that is used is "w|gz"? It seems to me that this is not necessary, "w:gz" should be enough. "w|gz" is for special operations only (see the tarfile docs). -- nosy: +lars.gustaebel Added file: http://bugs.python.org/file24065/distutils_tarfile_fix.diff ___ Python tracker <http://bugs.python.org/issue11638> ___
[issue13639] UnicodeDecodeError when creating tar.gz with unicode name
Lars Gustäbel added the comment: tarfile under Python 2.x is not particularly designed to support unicode filenames (the gzip module does not support them either), but that should not be too hard to fix. -- keywords: +patch Added file: http://bugs.python.org/file24066/tarfile-stream-gzip-unicode-fix.diff ___ Python tracker <http://bugs.python.org/issue13639> ___
[issue11638] python setup.py sdist --formats tar* crashes if version is unicode
Lars Gustäbel added the comment: Just for the record: The gzip format (defined in RFC 1952) allows storing the original filename (without the .gz suffix) in an additional field in the header (the FNAME field). Latin-1 (iso-8859-1) is required. It is ironic that this causes so much trouble, because it is never used. A gzip file without that field is perfectly valid. The gzip program for example stores the original filename by default but does not use it when decompressing unless it is explicitly told to do so with the -N/--name option. If no FNAME field is present in a gzipped file the gzip program just falls back on stripping the .gz suffix. -- ___ Python tracker <http://bugs.python.org/issue11638> ___
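The optional FNAME field can be inspected directly in the header bytes. The following is a sketch based on the header layout in RFC 1952; stored_name is a made-up helper, and it deliberately gives up when an FEXTRA field is present instead of skipping it.

```python
import gzip
import io

FNAME = 0x08   # FLG bit: original file name present (RFC 1952)
FEXTRA = 0x04  # FLG bit: extra field present


def stored_name(raw):
    """Return the FNAME field of a gzip stream, or None if it is absent."""
    flags = raw[3]
    if not flags & FNAME:
        return None
    if flags & FEXTRA:
        raise ValueError("FEXTRA present; this sketch does not skip it")
    # The name follows the fixed 10-byte header and is zero-terminated.
    end = raw.index(b"\x00", 10)
    return raw[10:end].decode("latin-1")


buf = io.BytesIO()
with gzip.GzipFile(filename="report.txt", mode="wb", fileobj=buf) as f:
    f.write(b"payload")
named = buf.getvalue()
unnamed = gzip.compress(b"payload")

print(stored_name(named), stored_name(unnamed))
print(gzip.decompress(unnamed))
```

Both streams decompress to the same data, which matches the point above that the field is optional.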
[issue13639] UnicodeDecodeError when creating tar.gz with unicode name
Lars Gustäbel added the comment: See http://bugs.python.org/issue11638#msg150029 -- ___ Python tracker <http://bugs.python.org/issue13639> ___
[issue5689] Support xz compression in tarfile module
Lars Gustäbel added the comment: Wouldn't it be better then to use a default compresslevel of 6 in tarfile? I used level 9 in my patch without a particular reason, just because I thought 9 must be better than 6 ;-) -- Added file: http://bugs.python.org/file24084/lzma-preset.diff ___ Python tracker <http://bugs.python.org/issue5689> ___
[issue5689] Support xz compression in tarfile module
Changes by Lars Gustäbel : Removed file: http://bugs.python.org/file24084/lzma-preset.diff ___ Python tracker <http://bugs.python.org/issue5689> ___
[issue5689] Support xz compression in tarfile module
Lars Gustäbel added the comment: Yes, that's much better. Thanks for the tip. -- Added file: http://bugs.python.org/file24086/lzma-preset.diff ___ Python tracker <http://bugs.python.org/issue5689> ___
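For reference, this is roughly how the compression level can be chosen when writing an xz-compressed archive. It is a sketch that assumes a Python version whose tarfile module supports the "w:xz" mode and the preset keyword; the member name and payload are made up.

```python
import io
import tarfile

payload = b"hello xz\n"

# Write a one-member archive into memory with an explicit preset.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:xz", preset=6) as tar:
    info = tarfile.TarInfo(name="hello.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# Read it back to confirm the round trip.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:xz") as tar:
    names = tar.getnames()
    data = tar.extractfile("hello.txt").read()

print(names, data)
```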
[issue13639] UnicodeDecodeError when creating tar.gz with unicode name
Lars Gustäbel added the comment: I thought about that myself, too. It is clearly no new feature, it is really more some kind of a fix. Unicode pathnames given to tarfile.open() are just passed through to the open() function, which is why this always has been working, except for this particular case. There are 6 different possible write modes: "w:", "w:gz", "w:bz2", "w|", "w|gz" and "w|bz2". And the only one not working with a unicode pathname is "w|gz". Although admittedly tarfile.open() is not supposed to be used with a unicode path, people do it anyway, because they don't care, and because it works. The patch does not add a new broad functionality, it merely harmonises the way the six write modes work. Neither can we retroactively enforce using string pathnames at this point, nor should we let a user run into this strange error. The patch is very small and minimally invasive. The error message you get without the patch is completely incomprehensible. -- ___ Python tracker <http://bugs.python.org/issue13639> ___
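The six write modes listed above can be compared side by side. The sketch below runs under Python 3, where the unicode problem discussed here does not arise, and only shows that every mode produces an archive with the same content. The roundtrip helper and the member name are made up for the illustration.

```python
import io
import tarfile

WRITE_MODES = ("w:", "w:gz", "w:bz2", "w|", "w|gz", "w|bz2")


def roundtrip(mode, payload=b"same content\n"):
    """Write one member with the given mode and read it back."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        info = tarfile.TarInfo(name="member.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    # "r:*" lets tarfile detect the compression transparently.
    with tarfile.open(fileobj=buf, mode="r:*") as tar:
        return tar.extractfile("member.txt").read()


results = {mode: roundtrip(mode) for mode in WRITE_MODES}
for mode, data in results.items():
    print(mode, data)
```

The "w|" variants write a stream of blocks without seeking, which is why they go through a separate code path from the "w:" variants.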
[issue13639] UnicodeDecodeError when creating tar.gz with unicode name
Lars Gustäbel added the comment:

I think we should wrap this up as soon as possible, because it has already absorbed too much of our time. The issue we discuss here is a tiny glitch triggered by a corner-case. My original idea was to fix it in a minimal sort of way that is backwards-compatible.

There are at least 4 different solutions now:

1. Keep the patch.
2. Revert the patch, leave everything as it was as wontfix.
3. Don't write an FNAME field at all if the filename that is passed is a unicode string.
4. Rewrite the FNAME code the way Terry suggests. This seems to me like the most complex solution, because we have to fix gzip.py as well, because the code in question was originally taken from the gzip module. (BTW, both the tarfile and gzip module discard the FNAME field when a file is opened for reading.)

My favorites are 1 and 3 ;-)

-- assignee: -> lars.gustaebel priority: normal -> low ___ Python tracker <http://bugs.python.org/issue13639> ___
[issue13702] relative symlinks in tarfile.extract broken (windows)
Changes by Lars Gustäbel : -- assignee: -> lars.gustaebel nosy: +lars.gustaebel versions: +Python 3.3 ___ Python tracker <http://bugs.python.org/issue13702> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13702] relative symlinks in tarfile.extract broken (windows)
Lars Gustäbel added the comment: You actually hit two bugs at the same time here: The target of the created symlink was not translated from unix to windows path delimiters and is therefore broken. The second bug is issue12926 which leads to the error in TarFile.makefile(). Brian, AFAIK all file-specific functions on windows accept forward slashes in pathnames, right? Has this been discussed in the course of the windows implementation of os.symlink()? I could certainly fix the slash translation in tarfile.py, but maybe it's os.symlink() that should be fixed. -- dependencies: +tarfile tarinfo.extract*() broken with symlinks ___ Python tracker <http://bugs.python.org/issue13702> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13702] relative symlinks in tarfile.extract broken (windows)
Lars Gustäbel added the comment: The dereference option is only used for archive creation: the contents of the file a symbolic link points to are added instead of the symbolic link itself. -- ___ Python tracker <http://bugs.python.org/issue13702> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12926] tarfile tarinfo.extract*() broken with symlinks
Lars Gustäbel added the comment: This should be fixed now, thanks. -- resolution: -> fixed stage: -> committed/rejected status: open -> closed versions: +Python 3.3 ___ Python tracker <http://bugs.python.org/issue12926> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12350] Improve stat_result.st_blocks and st_blksize documentation
New submission from Lars Wirzenius : Attached patch adds a few words to the os.stat documentation for the st_blocks and st_blksize fields to clarify them. -- assignee: docs@python components: Documentation files: stat_result.patch keywords: patch messages: 138467 nosy: docs@python, liw priority: normal severity: normal status: open title: Improve stat_result.st_blocks and st_blksize documentation versions: Python 2.7 Added file: http://bugs.python.org/file22382/stat_result.patch ___ Python tracker <http://bugs.python.org/issue12350> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12350] Improve stat_result.st_blocks and st_blksize documentation
Lars Wirzenius added the comment: http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html says "A file system-specific preferred I/O block size for this object. In some file system types, this may vary from file to file.", which says essentially the same as the Linux stat(2) manpage from which I copied the extra words. The same page claims that st_blocks may use other units than 512 byte blocks, but that seems to be quite rare. GNU coreutils sources claim HP-UX and AIX PS/2 have non-512 blocks. Perhaps it would be better to indicate how to find out the block size? (Since st_blksize is not it, but that's an easy assumption to make.) -- ___ Python tracker <http://bugs.python.org/issue12350> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12350] Improve stat_result.st_blocks and st_blksize documentation
Lars Wirzenius added the comment: Right. So I guess at least the following should be changed (I'll make an actual patch once there's consensus):
* st_blocks should say that the size of a block is often 512 bytes, but that's not guaranteed, and there's no way to know for sure
* st_blksize should say it is the size of efficient I/O, and is unrelated to st_blocks
Should there be something more? Ideally, there should be a way to find out the size of blocks for st_blocks, but I don't know how to figure that out (though probably code from GNU's du could be borrowed). -- ___ Python tracker <http://bugs.python.org/issue12350> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
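The distinction between the two fields can be illustrated with a short sketch. It assumes the common 512-byte unit for st_blocks, which, as noted above, is not guaranteed; the helper name allocated_bytes and its block_unit parameter are made up for this example.

```python
import os
import tempfile

def allocated_bytes(path, block_unit=512):
    """Estimate on-disk allocation from st_blocks.

    block_unit=512 is an assumption: common, but not guaranteed.
    """
    st = os.stat(path)
    blocks = getattr(st, "st_blocks", None)  # not available on every platform
    if blocks is None:
        return None
    return blocks * block_unit

# Create a small scratch file to inspect.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 10000)
    tmp_path = f.name

st = os.stat(tmp_path)
alloc = allocated_bytes(tmp_path)
print("st_size:", st.st_size)
# st_blksize is a hint for efficient I/O, not the unit of st_blocks.
print("st_blksize:", getattr(st, "st_blksize", None))
print("allocated (assuming 512-byte units):", alloc)
os.unlink(tmp_path)
```

The allocated figure usually differs from st_size because files are allocated in whole blocks, and it should not be derived from st_blksize.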
[issue12841] Incorrect tarfile.py extraction
Lars Gustäbel added the comment: The patch is fine. Thank you very much for it, Sebastien. I think we have to go without a unit test. -- ___ Python tracker <http://bugs.python.org/issue12841> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12841] Incorrect tarfile.py extraction
Lars Gustäbel added the comment: Yes, it should be fixed in all affected branches. -- ___ Python tracker <http://bugs.python.org/issue12841> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12841] Incorrect tarfile.py extraction
Lars Gustäbel added the comment: Yes, I can do that as soon as I've managed to wrap my head around using Mercurial and the new way of developing Python. I have been away from Python programming for quite some time and haven't adapted my workflow yet. -- ___ Python tracker <http://bugs.python.org/issue12841> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11879] TarFile.chown: should use TarInfo.uid if user lookup fails
Lars Gustäbel added the comment: Issue #12841 is a duplicate of this one, but I give it precedence because it comes with a working patch. -- resolution: -> duplicate status: open -> closed versions: +Python 2.7, Python 3.3 ___ Python tracker <http://bugs.python.org/issue11879> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12841] Incorrect tarfile.py extraction
Changes by Lars Gustäbel : -- versions: +Python 2.7, Python 3.3 ___ Python tracker <http://bugs.python.org/issue12841> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12841] Incorrect tarfile.py extraction
Lars Gustäbel added the comment: Close as fixed. Thanks all! -- resolution: -> fixed stage: -> committed/rejected status: open -> closed ___ Python tracker <http://bugs.python.org/issue12841> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12841] Incorrect tarfile.py extraction
Lars Gustäbel added the comment: It's the low-level operating system aspects of tarfile that are very difficult to test, e.g. filesystem and operating system dependent features such as symbolic links, hard links, file permissions, ownership. It is not even possible to reliably determine the filesystem the testsuite currently runs on. Also, superuser privileges are needed for some operations to work, e.g. chown(). A testsuite is normally not run as root, so a test that depends on this will never get enough coverage. -- ___ Python tracker <http://bugs.python.org/issue12841> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12926] tarfile tarinfo.extract*() broken with symlinks
Changes by Lars Gustäbel : -- assignee: -> lars.gustaebel ___ Python tracker <http://bugs.python.org/issue12926> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue9065] tarfile: default root:root ownership is incorrect.
Lars Gustäbel added the comment: Fixed in r85211 (py3k), r85212 (release31-maint), r85213 (release27-maint). Thank you for the report. -- resolution: -> accepted stage: -> committed/rejected status: open -> closed type: -> behavior ___ Python tracker <http://bugs.python.org/issue9065> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10184] tarfile touches directories twice
Lars Gustäbel added the comment: The patch goes a bit too far. Though it solves this particular problem with the way TarFile.extractall() works, it breaks TarFile.extract(). Running the testsuite does not expose this, simply because there's no testcase :-( Please see the attached testcase I just wrote which illustrates the problem. -- nosy: +lars.gustaebel Added file: http://bugs.python.org/file19352/tarfile-extract-directory-test.diff ___ Python tracker <http://bugs.python.org/issue10184> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10184] tarfile touches directories twice
Lars Gustäbel added the comment: I'm not entirely happy with the name of the "touch" argument. Apart from it being nice and short, I think it's a little too unix-y and might be misleading because it is not only about setting the modification time as the name implies, but also owner and mode. My proposal would be "restore_attrs" or "set_attrs" which isn't half as nice as "touch", but sums up better what's actually done. I leave this up to you. I think the testcase wouldn't work on Windows the way it is now, would it? Apart from these minor issues the patch gets my blessing, go ahead ;-) -- ___ Python tracker <http://bugs.python.org/issue10184> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10261] tarfile iterator without members caching
Lars Gustäbel added the comment: I assume you're using Python 2.x, because tarfile's memory footprint was significantly reduced in Python 3.0, see the patch in issue2058 and r62337. This patch was not backported to the 2.x branch back then. As the 2.x branch has been closed for new features, this is not going to happen in the future. -- assignee: -> lars.gustaebel nosy: +lars.gustaebel ___ Python tracker <http://bugs.python.org/issue10261> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10292] tarinfo should use relative symlinks
Lars Gustäbel added the comment: Apparently you were in quite a hurry when you filed this bug report.
- What is the exact problem and how does it manifest itself?
- Any helpful details (tracebacks, output)?
- Is there a testcase or example code you can provide?
- Which other tools are you talking about?
- Your patch is faulty (upperdir is undefined).
- The patch applies to neither the r27 tag nor the release27-maint branch although you claim that 2.7 is affected.
- Which exact Python version are you using on what platform?
- Oh, and a little more politeness wouldn't hurt.
Okay, so please ask again. I am curious to see what you've found. -- ___ Python tracker <http://bugs.python.org/issue10292> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10369] tarfile requires an actual file on disc; a file-like object is insufficient
Lars Gustäbel added the comment: Hm, why don't you just do this:

    with stat_tarfile.open(fileobj = sys.stdout, mode = "w|") as tar:
        for number in xrange(100):
            fileobj = generate_file_content(number)
            tarinfo = tar.gettarinfo(fileobj=open("/etc/passwd"))
            tarinfo.name = 'file-%d.txt' % number
            tarinfo.size = len(str(number)) * 100
            tarinfo.uid = 1000
            tarinfo.gid = 1000
            tarinfo.uname = "dstromberg"
            tarinfo.gname = "dstromberg"
            tar.addfile(tarinfo, fileobj)

Wouldn't that work, too? Or am I missing something? -- assignee: -> lars.gustaebel nosy: +lars.gustaebel ___ Python tracker <http://bugs.python.org/issue10369> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
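For readers on Python 3, the same idea might look roughly like the sketch below. It is not the code from the comment: it builds each TarInfo directly instead of borrowing one from /etc/passwd via gettarinfo(), writes to an in-memory buffer instead of sys.stdout, and generate_file_content is a hypothetical stand-in for the reporter's generator.

```python
import io
import tarfile

def generate_file_content(number):
    # Hypothetical stand-in for the reporter's content generator.
    return (str(number) * 100).encode("ascii")

out = io.BytesIO()  # stands in for a binary stream such as sys.stdout.buffer
with tarfile.open(fileobj=out, mode="w|") as tar:
    for number in range(3):
        data = generate_file_content(number)
        tarinfo = tarfile.TarInfo(name="file-%d.txt" % number)
        tarinfo.size = len(data)  # must match the bytes actually written
        tarinfo.uid = 1000
        tarinfo.gid = 1000
        tar.addfile(tarinfo, io.BytesIO(data))

# Read the stream back to check what was written.
out.seek(0)
with tarfile.open(fileobj=out, mode="r|") as tar:
    names = [member.name for member in tar]
print(names)
```

No file on disk is involved at any point; only the size field has to be known before addfile() is called.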
[issue10436] tarfile.extractfile in "r|" stream mode fails with filenames or members from getmembers()
Lars Gustäbel added the comment: This behaviour is intentional. A tar archive does not contain a central directory structure, it is just a chain of files. As a side-effect it is possible to have multiple files with the same name in one archive, e.g. when append mode was used. That's why the archive must be scanned from the beginning to the end as soon as you reference an archive member by its name. The best way to deal with this issue in my opinion is to improve the documentation for the stream interface. -- assignee: -> lars.gustaebel nosy: +lars.gustaebel priority: normal -> low versions: +Python 3.1, Python 3.2, Python 3.3 ___ Python tracker <http://bugs.python.org/issue10436> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
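A minimal sketch of how the stream interface is meant to be used, under the assumption that each member is handled in archive order as soon as it is encountered, rather than looked up by name afterwards. The archive contents here are invented for the example.

```python
import io
import tarfile

# Build a small archive in memory.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for name, payload in [("a.txt", b"alpha"), ("b.txt", b"beta")]:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

# In "r|" mode the archive can only be read front to back,
# so read each member's data while iterating.
buf.seek(0)
contents = {}
with tarfile.open(fileobj=buf, mode="r|") as tar:
    for member in tar:
        fobj = tar.extractfile(member)
        if fobj is not None:  # None for non-file members such as directories
            contents[member.name] = fobj.read()
print(contents)
```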
[issue11224] 3.2: tarfile.getmembers causes 100% cpu usage on Windows
Lars Gustäbel added the comment: Thanks for your great report. This is fixed now in r88528 (py3k) and r88529 (release32-maint). -- keywords: +3.2regression resolution: -> accepted stage: -> committed/rejected status: open -> closed versions: +Python 3.3 ___ Python tracker <http://bugs.python.org/issue11224> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11449] tarfile tries to file_.tell() even when creating a new archive
Lars Gustäbel added the comment: If I understand correctly, the solution to your problem would be to use the stream mode "w|" instead of "w". Could you please try that? -- assignee: -> lars.gustaebel ___ Python tracker <http://bugs.python.org/issue11449> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
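To illustrate the suggestion, here is a sketch with a sink object that offers nothing but write(). The WriteOnlySink class is hypothetical and only models a file-like object without tell() or seek().

```python
import io
import tarfile

class WriteOnlySink:
    """Hypothetical sink that supports write() only (no tell/seek)."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)

sink = WriteOnlySink()
payload = b"hello"
# Stream mode "w|" only ever calls write() on the underlying object.
with tarfile.open(fileobj=sink, mode="w|") as tar:
    info = tarfile.TarInfo("hello.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

archive = b"".join(sink.chunks)
print(len(archive))
```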
[issue11457] Expose nanosecond precision from system calls
Changes by Lars Gustäbel : -- nosy: +lars.gustaebel ___ Python tracker <http://bugs.python.org/issue11457> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue7101] tarfile: OSError with TarFile.add(..., recursive=True) about non-existing file
Lars Gustäbel added the comment: I kept this issue open, because I have not yet come to a decision. I don't think the current behaviour is a bug, but these kinds of errors could be handled more intelligently. For example, errors during extraction can be hidden depending on the TarFile.errorlevel attribute. Something similar could be done for creation of archives. What exactly, I don't know... I have not yet managed to make up my mind. -- ___ Python tracker <http://bugs.python.org/issue7101> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11787] File handle leak in TarFile lib
Changes by Lars Gustäbel : -- assignee: -> lars.gustaebel ___ Python tracker <http://bugs.python.org/issue11787> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11899] TarFile.gettarinfo modifies self.inodes
Lars Gustäbel added the comment: Good point. Do you happen to have a working implementation already? -- assignee: -> lars.gustaebel priority: normal -> low versions: +Python 3.3 -Python 3.2 ___ Python tracker <http://bugs.python.org/issue11899> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11879] TarFile.chown: should use TarInfo.uid if user lookup fails
Changes by Lars Gustäbel : -- assignee: -> lars.gustaebel priority: normal -> low ___ Python tracker <http://bugs.python.org/issue11879> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11971] Wrong parameter -O0 instead of -OO in manpage
New submission from Lars Michelsen : Hello Devs, digging around in the python manpage and playing with the parameters I found a wrong parameter specification in the python manpage. The -OO parameter for discarding docstrings is written as -O0 (the 2nd is a zero). A patch is attached. Regards -- assignee: docs@python components: Documentation files: fix-OO-param.patch keywords: patch messages: 134909 nosy: docs@python, lm priority: normal severity: normal status: open title: Wrong parameter -O0 instead of -OO in manpage versions: Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4 Added file: http://bugs.python.org/file21846/fix-OO-param.patch ___ Python tracker <http://bugs.python.org/issue11971> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10292] tarinfo should use relative symlinks
Lars Gustäbel added the comment: Okay, this bug has been fixed in the 2.7 series. Python 2.6 is now in security-fix-only mode which means that there will not be a fix for it. Therefore, I close this issue. -- resolution: -> fixed status: open -> closed versions: +Python 2.6 -Python 2.7 ___ Python tracker <http://bugs.python.org/issue10292> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10261] tarfile iterator without members caching
Lars Gustäbel added the comment: There is no trivial or backwards-compatible solution to this problem. The way it is now, there is no alternative to storing all TarInfo objects: there is no central table of contents in an archive we could use, so we must create our own. In other words, tarfile does not "burn" memory without a reason. The problem you encounter is something of a corner case, fortunately with a simple workaround:

    for tarinfo in tar:
        ...
        tar.members = []

There are two things that I will clearly refuse to do. One thing is to add yet another option to the TarFile class to switch off caching as this would make many TarFile methods dysfunctional without the user knowing why. The other thing is to add an extra non-caching Iterator class. Sorry that I have nothing more to offer. Maybe someone else comes up with a brilliant idea. -- ___ Python tracker <http://bugs.python.org/issue10261> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
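A runnable sketch of the workaround, assuming the member list is reset inside the loop after each member has been handled. The archive is built in memory only for the sake of the example.

```python
import io
import tarfile

# Build a small archive in memory.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for i in range(5):
        data = b"x" * i
        info = tarfile.TarInfo("member-%d" % i)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

buf.seek(0)
seen = []
with tarfile.open(fileobj=buf, mode="r") as tar:
    for tarinfo in tar:
        seen.append(tarinfo.name)
        # Drop the cached TarInfo objects so they can be garbage collected.
        tar.members = []
print(seen)
```

As the comment warns, methods that depend on the cached list (for example lookups by name) cannot be relied on once the list has been cleared.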
[issue10760] tarfile doesn't handle sysfs well
Changes by Lars Gustäbel : -- assignee: -> lars.gustaebel components: +Library (Lib) -None nosy: +lars.gustaebel ___ Python tracker <http://bugs.python.org/issue10760> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10761] tarfile.extractall fails to overwrite symlinks
Changes by Lars Gustäbel : -- assignee: -> lars.gustaebel nosy: +lars.gustaebel ___ Python tracker <http://bugs.python.org/issue10761> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11224] 3.2: tarfile.getmembers causes 100% cpu usage on Windows
Lars Gustäbel added the comment: _FileInFile.read() does lots of unnecessary seeking and reads the same block again and again. The attached patch fixes that. Please try if it works for you. -- assignee: -> lars.gustaebel keywords: +patch Added file: http://bugs.python.org/file20793/tarfile.diff ___ Python tracker <http://bugs.python.org/issue11224> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3016] tarfile.py incurs exception from self.chmod() when tarball has g+s
Lars Gustäbel <[EMAIL PROTECTED]> added the comment: With some effort I could reproduce the problem (on a FAT32 filesystem), but what we have here is clearly a usage problem. In unpack_tarfile() in setuptools/archive_util.py TarFile's internal _extract_member() method is used to extract the contents. For every non-fatal error (like a failing chmod()) _extract_member() raises an ExtractError exception. In TarFile.extract() these ExtractErrors are normally ignored. The unpack_tarfile() function in setuptools needs some fixing, it should either act more like TarFile.extract() or better use the public API. -- resolution: -> works for me status: open -> closed ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3016> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3039] tarfile.TarFileCompat.writestr(ZipInfo, str) raises AttributeError
Changes by Lars Gustäbel <[EMAIL PROTECTED]>: -- assignee: -> lars.gustaebel nosy: +lars.gustaebel ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3039> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3039] tarfile.TarFileCompat.writestr(ZipInfo, str) raises AttributeError
Lars Gustäbel <[EMAIL PROTECTED]> added the comment: Thank you very much for your patch, I applied it to the trunk as r65402. However, I decided to remove the TarFileCompat class from the Python 3.0 branch, see http://mail.python.org/pipermail/python-3000/2008-July/014448.html for details. -- resolution: -> accepted status: open -> closed ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3039> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3830] Tarfile has incorrect USTAR "VERSION" field (should be 00; is 0 NUL)
Lars Gustäbel <[EMAIL PROTECTED]> added the comment: This problem existed only in the first 2.5 release. -- resolution: -> works for me status: open -> closed ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3830> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3838] test_tarfile error on cygwin (Directory not empty)
Lars Gustäbel <[EMAIL PROTECTED]> added the comment: The patch is okay. Go ahead. BTW, I've never used Cygwin before, is it always that slow? 10 minutes for a configure script on a recent machine is a real pain. -- nosy: +lars.gustaebel ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3838> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue7693] tarfile.extractall can't have unicode extraction path
Lars Gustäbel added the comment: So, use the pax format. It stores the filenames as utf-8 and this way you will be on the safe side. I hope we both agree that the solution to your particular problem is nothing tarfile.py can provide. So, I am going to close this issue now. -- resolution: -> works for me status: open -> closed ___ Python tracker <http://bugs.python.org/issue7693> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
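A short sketch of what using the pax format looks like with the Python 3 API. The file name is invented for the example, and the format is requested explicitly even though newer Python versions already default to it.

```python
import io
import tarfile

name = "données/résumé.txt"  # non-ASCII member name (example only)
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
    tar.addfile(tarfile.TarInfo(name))  # empty member, size 0

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    round_tripped = tar.getnames()[0]
print(round_tripped == name)
```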
[issue7232] Support of 'with' statement fo TarFile class
Lars Gustäbel added the comment: Another version of the patch (issue7232.6.diff) that checks if the TarFile object is still open in the __enter__() method (plus a test for that). I removed the docstrings as Eric suggested. This is common practice in the standard library. -- Added file: http://bugs.python.org/file16396/issue7232.6.diff ___ Python tracker <http://bugs.python.org/issue7232> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue7232] Support of 'with' statement fo TarFile class
Lars Gustäbel added the comment: IMO it is okay for __enter__() and __exit__() not to have docstrings. I cannot see what's so special about the behaviour of __enter__() and __exit__(). __enter__() raises IOError only if the TarFile object has already been closed. This is exactly the behaviour I would expect, because it is what every other TarFile method does when the object has been closed. IOW, using a closed TarFile as a context manager is the programmer's mistake, and I don't feel the need to document that case. The fact that __exit__() only closes the TarFile object and does not swallow exceptions is what everyone expects from a "file object". It is the only logical thing to do, no need to document that either. The test_context_manager_exception() test is fine. If the call to tarfile.open() really raises an exception then something is so terribly wrong that probably all of the testsuite's 200 tests will fail anyway. We can safely assume here that this will work, no need to double-check. However, I have changed the docs again to be a bit more specific. -- Added file: http://bugs.python.org/file16400/issue7232.8.diff ___ Python tracker <http://bugs.python.org/issue7232> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
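The behaviour described above can be observed with a small sketch. It is written for Python 3, where IOError is an alias of OSError.

```python
import io
import tarfile

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    tar.addfile(tarfile.TarInfo("empty.txt"))
closed_after_with = tar.closed  # __exit__() closed the TarFile

# Re-entering an already closed TarFile is a programming error.
try:
    with tar:
        pass
except OSError as exc:
    reentry_error = exc
else:
    reentry_error = None
print(closed_after_with, type(reentry_error).__name__)
```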
[issue7232] Support of 'with' statement fo TarFile class
Lars Gustäbel added the comment: I found an issue that needs to be addressed: if there is an error while the TarFile object is opened for writing, we cannot simply call TarFile.close() in the __exit__() method. close() would try to finalize the archive, i.e. write two zero end-of-archive blocks and a number of padding blocks. I changed __exit__() to call close() only if everything went fine. If there was an exception only the most basic cleanup is done. I added more tests and adapted the docs. -- Added file: http://bugs.python.org/file16401/issue7232.9.diff ___ Python tracker <http://bugs.python.org/issue7232> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
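The effect of this change can be seen by comparing an archive written normally with one whose with block was aborted by an exception. This is only a sketch: build_archive and the simulated RuntimeError are made up for the illustration.

```python
import io
import tarfile

def build_archive(fail):
    """Write one empty member; optionally raise inside the with block."""
    buf = io.BytesIO()
    try:
        with tarfile.open(fileobj=buf, mode="w") as tar:
            tar.addfile(tarfile.TarInfo("empty.txt"))
            if fail:
                raise RuntimeError("simulated failure while writing")
    except RuntimeError:
        pass
    return buf.getvalue()

finished = build_archive(fail=False)  # close() wrote end-of-archive blocks
aborted = build_archive(fail=True)    # no finalization took place
print(len(finished), len(aborted))
```

The aborted archive is shorter because the end-of-archive blocks and the padding were never written.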
[issue7232] Support of 'with' statement fo TarFile class
Lars Gustäbel added the comment: Okay, it is done, see r78623 (trunk) and r78626 (py3k). Thanks to all for your work and support! -- resolution: -> accepted status: open -> closed ___ Python tracker <http://bugs.python.org/issue7232> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue7232] Support of 'with' statement fo TarFile class
Changes by Lars Gustäbel : -- stage: patch review -> committed/rejected ___ Python tracker <http://bugs.python.org/issue7232> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue8464] tarfile creates tarballs with execute permissions set
Lars Gustäbel added the comment: 0666 is the right mode and the patch is correct. @Tarek: Why does shutil.make_archive() use mode "w|..." instead of "w:..."? IMHO that is not necessary, because it works on regular files only. -- ___ Python tracker <http://bugs.python.org/issue8464> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue8464] tarfile creates tarballs with execute permissions set
Lars Gustäbel added the comment: I applied the patch and added a test case (see r80616-r80619). Thanks for the report. -- resolution: -> accepted stage: -> committed/rejected status: open -> closed ___ Python tracker <http://bugs.python.org/issue8464> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue8390] tarfile: use surrogates for undecode fields
Lars Gustäbel added the comment: Yes, I will soon have ;-) Please give me a few days... -- ___ Python tracker <http://bugs.python.org/issue8390> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue8390] tarfile: use surrogates for undecode fields
Lars Gustäbel added the comment: I think it is a good suggestion to use "surrogateescape" as the default, because (I hope) it produces the fewest errors and is the best choice if tarfile is used in connection with Python's filesystem calls.
- When reading tar headers, undecodable chars in filenames end up as surrogates. This way no information is lost. In principle tarfile is merely a gateway to a filesystem inside an archive, so it feels natural if it treats filenames the same as Python's filesystem calls.
- When writing tar headers, filenames with surrogate chars (e.g. from os.listdir()) will be converted back to bytes in the header (in case of gnu and ustar formats). Filenames will remain unchanged, this is exactly as one would expect.
- When writing pax headers, filenames with surrogates will raise a UnicodeError because we may only use strict utf-8 inside a pax header. This is actually no difference to the status quo.
@Martin: As I understand it, the pax "invalid"-option is supposed to deal with the case when strings from a pax header are not representable in the user's encoding. In tarfile's case we don't have this problem when reading the archive until we try to extract it. Unfortunately, POSIX says nothing about how to store bad filenames in a pax archive. tarfile raises an error. GNU tar fails silently, it just puts the unchanged original filename into the pax header without converting it to utf-8, thus violating the standard. -- Added file: http://bugs.python.org/file17227/tarfile_surrogates.2.diff ___ Python tracker <http://bugs.python.org/issue8390> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
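The round trip that "surrogateescape" makes possible can be shown at the codec level, independent of tarfile. The byte string below is an invented example of a file name that is not valid UTF-8.

```python
raw = b"caf\xe9.txt"  # Latin-1 encoded name, not valid UTF-8

# Reading: the undecodable byte becomes a lone surrogate, nothing is lost.
decoded = raw.decode("utf-8", "surrogateescape")

# Writing gnu/ustar headers: the surrogate turns back into the original byte.
restored = decoded.encode("utf-8", "surrogateescape")

# Writing pax headers: strict UTF-8 refuses the surrogate.
try:
    decoded.encode("utf-8", "strict")
    strict_fails = False
except UnicodeEncodeError:
    strict_fails = True

print(restored == raw, strict_fails)
```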
[issue8633] tarfile doesn't support undecodable filename in PAX format
Lars Gustäbel added the comment: Victor, you misunderstood the pax definition, but it wouldn't harm tarfile if it knew how to handle these corrupt GNU tar archives. In the meantime I filed a bug report on bug-...@gnu.org for this. I said in msg105085 that POSIX gives no advice on how to handle broken filename encodings, but it does in POSIX:2008. libarchive (bsdtar) uses the way that is described there. The solution is to use a field called "hdrcharset". See http://www.opengroup.org/onlinepubs/9699919799/utilities/pax.html -- ___ Python tracker <http://bugs.python.org/issue8633> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue8633] tarfile doesn't support undecodable filename in PAX format
Lars Gustäbel added the comment: I am currently working on a patch to let tarfile use the hdrcharset field. -- ___ Python tracker <http://bugs.python.org/issue8633> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue8701] tarfile: first character of member names doubled
Changes by Lars Gustäbel : -- assignee: -> lars.gustaebel nosy: +lars.gustaebel ___ Python tracker <http://bugs.python.org/issue8701> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue8701] tarfile: first character of member names doubled
Lars Gustäbel added the comment: Unfortunately, I cannot reproduce your problem and ask you to please provide more information. Would it be possible to attach the output or a screenshot depicting the problem? Which operating system/distribution do you use? Have you double-checked your testing conditions? -- ___ Python tracker <http://bugs.python.org/issue8701> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue1531] tarfile.open(fileobj=f) and bad metadata of the first file within the archive
Changes by Lars Gustäbel: -- assignee: -> lars.gustaebel nosy: +lars.gustaebel __ Tracker <http://bugs.python.org/issue1531> __
[issue1531] tarfile.open(fileobj=f) and bad metadata of the first file within the archive
Lars Gustäbel added the comment: I fixed this in the trunk (r59260) and in the release25-maint branch (r59261). Thanks for the report. If you cannot wait for the next release, I recommend using mode "r|" as a workaround. BTW, 756 is not a realistic example value for the position of the second member: a header block must start on a 512-byte boundary. -- resolution: -> fixed status: open -> closed __ Tracker <http://bugs.python.org/issue1531> __
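Both points in the comment above, the "r|" stream mode and the 512-byte alignment of header blocks, can be checked with a small sketch. It is written for Python 3 and builds its own in-memory archive; the member names and the helpers build_archive and member_offsets are hypothetical and not taken from the original report.

```python
import io
import tarfile

BLOCKSIZE = 512  # tar headers and data are laid out in 512-byte blocks


def build_archive() -> bytes:
    """Create a small two-member ustar archive in memory (illustrative data)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, payload in (("first.txt", b"hello"), ("second.txt", b"world!")):
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def member_offsets(archive: bytes):
    """Read the archive as a forward-only stream and collect header offsets."""
    # Mode "r|" treats fileobj as a stream of blocks and never seeks backwards.
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r|") as tar:
        return [(member.name, member.offset) for member in tar]


offsets = member_offsets(build_archive())
for name, offset in offsets:
    print(name, offset, offset % BLOCKSIZE == 0)
```

Because the first member's 5 bytes of data are padded to a full block, the second header cannot sit at an arbitrary position such as 756; it lands on the next multiple of 512.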
[issue1735] tarfile.TarFile.extractall not setting directory permissions correctly
Changes by Lars Gustäbel: -- assignee: -> lars.gustaebel nosy: +lars.gustaebel __ Tracker <http://bugs.python.org/issue1735> __
[issue1735] tarfile.TarFile.extractall not setting directory permissions correctly
Lars Gustäbel added the comment: Committed to release25-maint branch as r59713. -- resolution: -> accepted status: open -> closed __ Tracker <http://bugs.python.org/issue1735> __
[issue1735] tarfile.TarFile.extractall not setting directory permissions correctly
Lars Gustäbel added the comment: Thanks for the patch. I added a test case and applied it to the trunk, see r59712. The Python 2.5 backport will follow later. __ Tracker <http://bugs.python.org/issue1735> __
[issue1527974] tarfile chokes on ipython archive on Windows
Lars Gustäbel added the comment: I am closing this issue because it is out of date. IMO, the new TarFile.extractall() method in Python 2.5 provides a way to solve the original problem. -- resolution: -> out of date status: open -> closed _ Tracker <http://bugs.python.org/issue1527974> _
[issue1540385] tarfile __slots__ addition
Lars Gustäbel added the comment: I am closing this issue as there has been no progress over the last 1.5 years. -- resolution: -> rejected status: open -> closed _ Tracker <http://bugs.python.org/issue1540385> _
[issue1886] Permit to easily use distutils "--formats=tar, gztar, bztar" on all systems
Lars Gustäbel added the comment: I just did some tests and could not find any major difference. Which differences did you find? BTW, in your patch the "dist" directory is not created automatically. __ Tracker <http://bugs.python.org/issue1886> __
[issue1886] Permit to easily use distutils "--formats=tar, gztar, bztar" on all systems
Lars Gustäbel added the comment: Hm, on my Linux box both files are more or less identical. Sorry, I cannot reproduce your problem. __ Tracker <http://bugs.python.org/issue1886> __