Eduardo Robles Elvira added the comment:

>> [...] but remember, we split a volume only in the middle of a big file, not
>> in any other case (AFAIK). Hopefully you don't get huge pax headers or
>> anything strange. [...]
> Hopefully? Sorry, but have you tested this? I did. I let GNU tar create a two
> volume archive that is split exactly between the two blocks of an XHDTYPE pax
> header.
>
> The result is terrifying. At the beginning of the second volume GNU tar
> creates an XGLTYPE header as the pax replacement for a GNUTYPE_MULTIVOL
> header, followed by an XHDTYPE header ("GNUFileParts") that somehow decorates
> the following REGTYPE(!) tar header that contains the continuation of the
> split XHDTYPE header data from the previous volume. After that comes the
> REGTYPE file that the split XHDTYPE header was actually meant for as
> decoration.
>
> I attached the archive to this issue.
>
> What happens if a GNUTYPE_LONGNAME header is split in two? I don't wanna
> know...
>
>> write() will need to take into account blocks (BLOCKSIZE), just to be able
>> to split the volumes correctly.
>
> It is mandatory to do the split on a block boundary (a multiple of 512).

>>> BTW, my version of GNU tar refuses to create compressed multiple-volume
>>> archives which is why I doubt the usefulness of this feature overall.
>> But it has multivolume support right? Which is what I am proposing here.
>> Also, you can gzip (or encrypt or anything) the volumes after creating the
>> volumes..
>
> Yeah, it has multivolume support, but a very limited one that is not only
> weird but isn't even usable together with compression. And sure, I can
> compress and encrypt the volumes afterward, but I can also create a
> compressed archive and pipe it through split(1) to split it into parts. Both
> ways create tar archives that are not readable by GNU tar because they're
> non-standard. So what?
>
> Please tell me, what is your actual personal use-case for this feature?

I'm willing to modify the patch to remove the "weirdness" you refer to.

I disagree that it's not usable: it might not be useful to you, but it's certainly a feature that covers part of the functionality of GNU tar. Actually, some of the unit tests work like this: use GNU tar to create the archive, then extract it with tarfile, and vice versa.

Of course you can use split. And I could use Ruby or Perl, but I'm using Python and tarfile, and this is a GNU tar feature that is just not supported in Python's tarfile upstream. I'm just trying to contribute this feature, if possible :-).

BTW, if I create a multivol tar file and then compress the volumes, that does not make it "non-standard", in the same way that creating a PNG file, compressing it and then storing it on EXTFS doesn't make it non-standard. I'm just using multiple layers of standards.

I'm a contractor, and I have been asked by a client to develop a Python-based backup tool. The client is technical and already had an idea of what he wanted to do: use python-tarfile and add support for multivolume and some other goodies. The client also wanted to try to push the changes upstream, as we believe it is the correct thing to do.

BTW, when we designed the backup tool, we ruled out split because it wouldn't allow us to correctly list all the files in each file-slice separately. We wanted to be able to recover all the files of each "volume" so that if we lose other volumes, we can still recover all the data from the volumes we have.

Anyway, if you are the maintainer of tarfile and you think it's not possible to push tar-multivolume support upstream in Python's tarfile for whatever reason, please tell me.
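
To make the point about split more concrete, here is a minimal sketch using only the stock tarfile module. It is not the patch from this issue and it does not produce GNU multivolume archives. The helper names make_tar and list_names are made up for the illustration, and the "volumes" at the end are simply independent archives. It only shows that a byte-level slice of a tar stream usually cannot be opened on its own, while a self-contained volume can.

```python
import io
import tarfile


def make_tar(members):
    """Build an uncompressed tar archive in memory from a {name: bytes} dict."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_names(raw):
    """Return the member names, or None if tarfile cannot read the bytes."""
    try:
        with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tar:
            return tar.getnames()
    except tarfile.TarError:
        return None


if __name__ == "__main__":
    files = {"a.txt": b"a" * 2000, "b.txt": b"b" * 2000}
    archive = make_tar(files)

    # Cut on a block boundary (a multiple of 512), as split(1) could be told
    # to do. The cut falls inside the data of a.txt, so the second slice
    # starts with file data instead of a tar header.
    cut = 3 * tarfile.BLOCKSIZE
    print("whole archive:", list_names(archive))
    print("second slice :", list_names(archive[cut:]))

    # Assumption for illustration only: one independent archive per "volume".
    for name, data in files.items():
        print("volume       :", list_names(make_tar({name: data})))
```

Losing the first slice of the split archive makes the second one unreadable without extra tooling, whereas each self-contained volume can still be listed and extracted by itself.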

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18321>
_______________________________________