Eduardo Robles Elvira added the comment:

>> [...] but remember, we split a volume only in the middle of a big file, not 
>> in any other case (AFAIK). Hopefully you don't get huge pax headers or 
>> anything strange. [...] 
> Hopefully? Sorry, but have you tested this? I did. I let GNU tar create a two 
> volume archive that is split exactly between the two blocks of an XHDTYPE pax 
> header.
>
> The result is terrifying. At the beginning of the second volume GNU tar 
> creates an XGLTYPE header as the pax replacement for a GNUTYPE_MULTIVOL 
> header, followed by an XHDTYPE header ("GNUFileParts") that somehow decorates 
> the following REGTYPE(!) tar header that contains the continuation of the 
> split XHDTYPE header data from the previous volume. After that comes the 
> REGTYPE file that the split XHDTYPE header was actually meant for as 
> decoration.
>
> I attached the archive to this issue.
>
> What happens if a GNUTYPE_LONGNAME header is split in two? I don't wanna 
> know...
>
>> write() will need to take into account blocks (BLOCKSIZE), just to be able 
>> to split the volumes correctly.
>
> It is mandatory to do the split on a block boundary (a multiple of 512).
>
>>> BTW, my version of GNU tar refuses to create compressed multiple-volume 
>>> archives which is why I doubt the usefulness of this feature overall.
>> But it has multivolume support right? Which is what I am proposing here. 
>> Also, you can gzip (or encrypt or anything) the volumes after creating the 
>> volumes..
>
> Yeah, it has multivolume support, but a very limited one that is not only 
> weird but isn't even usable together with compression. And sure, I can 
> compress and encrypt the volumes afterward, but I can also create a 
> compressed archive and pipe it through split(1) to split it into parts. Both 
> ways create tar archives that are not readable by GNU tar because they're 
> non-standard. So what?
>
> Please tell me, what is your actual personal use-case for this feature?

I'm willing to modify the patch to remove the "weirdness" you refer to. I 
disagree that it's not usable: it might not be useful to you, but it's 
certainly a feature that covers part of the functionality of GNU tar. In fact, 
some of the unit tests work like this: create the archive with GNU tar, then 
extract it with tarfile, and vice versa.

Of course you can use split. And I could use Ruby or Perl, but I'm using python 
and tarfile, and this is a GNU tar feature that is just not supported in python 
tarfile upstream, and I'm just trying to contribute this feature, if possible 
:-).

BTW, if I create a multivol tar file and then compress the volumes, that does 
not make it "non-standard", in the same way that if I create a PNG file and 
then compress it and then store it in EXTFS, it doesn't make it non-standard. 
I'm just using multiple layers of standards.
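
To make the layering point concrete, here is a minimal sketch using only the 
standard library. It is my own illustration, not code from the patch: the 
in-memory archive stands in for a single volume, and the member name and 
contents are made up.

```python
import gzip
import io
import tarfile

# Build a small uncompressed tar in memory (stands in for one volume).
raw = io.BytesIO()
with tarfile.open(fileobj=raw, mode="w") as tar:
    data = b"hello"
    info = tarfile.TarInfo("hello.txt")  # hypothetical member name
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

# Layer gzip on top as a separate step, after the volume already exists.
compressed = gzip.compress(raw.getvalue())

# Reading peels the layers off in reverse order: gzip first, then tar.
with tarfile.open(fileobj=io.BytesIO(compressed), mode="r:gz") as tar:
    names = tar.getnames()
    content = tar.extractfile("hello.txt").read()
```

The tar layer is untouched by the gzip layer, so the inner archive is exactly 
what it was before compression.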

I'm a contractor, and I have been asked by a client to develop a Python-based 
backup tool. The client is technical and already had an idea of what he wanted 
to do: use python-tarfile and add support for multivolume archives and some 
other goodies. The client also wanted to try to push the changes upstream, as 
we believe it is the correct thing to do.

BTW, when we designed the backup tool, we ruled out using split because split 
wouldn't allow us to correctly list the files in each file-slice separately. 
We wanted to be able to recover all the files of each "volume" so that if we 
lose other volumes, we can still recover all the data from the volumes we 
have.
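
A rough sketch of the difference, again my own illustration rather than code 
from the patch. It emulates split(1) by cutting an archive at an arbitrary 
byte offset, and contrasts that with volumes that are each a complete 
archive. This is a simplification: real multivolume archives also split a big 
file across volumes, which is not shown here. The helper names and file 
contents are hypothetical.

```python
import io
import tarfile

def build_tar(files):
    """Return the bytes of an uncompressed tar holding `files` (name -> bytes)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def list_members(raw):
    """List member names in `raw`, or return None if tarfile cannot read it."""
    try:
        with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tar:
            return tar.getnames()
    except tarfile.TarError:
        return None

files = {"a.txt": b"A" * 2000, "b.txt": b"B" * 2000}
archive = build_tar(files)

# Emulate split(1): the cut lands inside a.txt's data, so the second slice
# starts with file contents instead of a tar header.
second_slice = archive[1000:]
split_listing = list_members(second_slice)

# Self-contained volumes: each one is a complete archive on its own.
volumes = [build_tar({name: data}) for name, data in files.items()]
volume_listings = [list_members(v) for v in volumes]
```

Here the second slice cannot be listed at all, while each self-contained 
volume can be listed without the others.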

Anyway, if you are the maintainer of tarfile and you think it's not possible to 
push tar-multivolume support upstream in python tarfile for whatever reason, 
please tell me.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18321>
_______________________________________