Bugs item #1704156, was opened at 2007-04-20 01:21
Message generated for change (Comment added) made by dvusboy
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1704156&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Python Library
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: K. C. Wong (dvusboy)
Assigned to: Lars Gustäbel (gustaebel)
Summary: TarFile.addfile() throws a struct.error

Initial Comment:
When adding a file to a TarFile instance using addfile(), if the file
paths (name and arcname) are unicode strings, then a struct.error will
be raised. Python versions prior to 2.5 do not show this behaviour.

Assuming the current directory has a file named 'mac.txt', here is an
interactive session that shows the problem:

Python 2.5 (r25:51908, Apr 18 2007, 19:06:57)
[GCC 3.4.6 20060404 (Red Hat 3.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tarfile
>>> t=tarfile.open('test.tar', 'w')
>>> i=t.gettarinfo(u'mac.txt', u'mac.txt')
>>> t.addfile(i, file(u'mac.txt', 'r'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/tarfile.py", line 1422, in addfile
    self.fileobj.write(tarinfo.tobuf(self.posix))
  File "/usr/lib/python2.5/tarfile.py", line 871, in tobuf
    buf = struct.pack("%ds" % BLOCKSIZE, "".join(parts))
  File "/usr/lib/python2.5/struct.py", line 63, in pack
    return o.pack(*args)
struct.error: argument for 's' must be a string

----------------------------------------------------------------------

>Comment By: K. C. Wong (dvusboy)
Date: 2007-04-20 21:54

Message:
Logged In: YES
user_id=414175
Originator: YES

I see the workaround, and I have already implemented similar workarounds
in my code. However, I have two problems with this response:

1. The behaviour changed. The documentation did not explicitly say that
tarfile does not support unicode file paths, and it worked prior to 2.5,
so I would say breaking that behaviour at the least calls for a
documentation update.

2. The error message stems from struct failing to pack a unicode string.
First of all, I did not grasp the subtle message of it being a unicode
string as opposed to a non-unicode string. You see, I actually did not
expect a unicode string in the first place; it was a by-product of
TEXT_DATA from a DOM tree. I can understand why struct.pack() throws
(because no explicit encoding scheme was specified), but the error was so
cryptic with regard to tarfile itself that I had to modify tarfile to
track down the reason for the exception.

In short, I would prefer the owner of tarfile to make an explicit
supported or not-supported call on unicode file paths, document said
decision, and make a more reasonable attempt at presenting relevant
exceptions.

Thank you for looking into this.

----------------------------------------------------------------------

Comment By: Lars Gustäbel (gustaebel)
Date: 2007-04-20 12:25

Message:
Logged In: YES
user_id=642936
Originator: NO

tarfile.py was never guaranteed to work correctly with unicode filenames.
The fact that this works with Python < 2.5 is purely accidental.

You can work around this (sticking to your example):

i = t.gettarinfo(u'mac.txt', 'mac.txt')

or:

i = t.gettarinfo('mac.txt')

----------------------------------------------------------------------
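The failure in the traceback above can be reproduced outside tarfile. The following is a minimal sketch, written for Python 3 (an assumption; the report itself concerns Python 2.5), where text is `str` and byte strings are `bytes`. It shows that the `s` format of `struct.pack()` rejects text, and that encoding explicitly before packing avoids the error. `pack_header` and its `encoding` parameter are hypothetical names used for illustration only and are not part of tarfile.

```python
import struct

BLOCKSIZE = 512  # size of one tar header block, as used in the traceback


def pack_header(parts, encoding="utf-8"):
    """Hypothetical helper: join header fields and pad them to one block.

    Text fields are encoded explicitly, so struct only ever sees bytes.
    """
    encoded = []
    for part in parts:
        if isinstance(part, str):
            part = part.encode(encoding)
        encoded.append(part)
    return struct.pack("%ds" % BLOCKSIZE, b"".join(encoded))


# Passing text straight to the 's' format raises struct.error,
# which mirrors the failure reported in the traceback.
try:
    struct.pack("%ds" % BLOCKSIZE, "mac.txt")
except struct.error as exc:
    error_message = str(exc)

# Encoding first produces a NUL-padded block of BLOCKSIZE bytes.
block = pack_header(["mac.txt", b"\0"])
```

Encoding at the boundary, rather than relying on an implicit conversion, is also the idea behind the workaround of passing a plain (non-unicode) string as the archive name.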