On Thu, 31 Oct 2024 at 08:09, Sergey Poznyakoff
<g...@gnu.org.ua> wrote:
>
> Hi Mateo,
>
> > For some weird reason this works when extracting but not when listing
> > archive content:
>
> There is a considerable difference between the two operations.  When
> listing an archive, tar attempts to seek in it, in order to skip file
> content, something it doesn't do when extracting.  If lseek returns
> -1, the archive is marked as non-seekable and further skipping is done
> by reading and discarding contents (that's normally what happens when
> reading archive from stdin).  However, if lseek returns a positive
> value, that value must be divisible by the record size.  You'll get
> this diagnostic if it is not:
>
> > tar: rmtlseek not stopped at a record boundary
>
> I tried both operations on an arbitrarily selected deb file, and didn't
> encounter any problems.  However, I was using git HEAD (I haven't had
> the time to inspect your patches yet, sorry).  Can something in your
> changes cause such behavior?  FWIW, the main question is actually: why
> did lseek succeed when reading from stdin?
>
> Regards,
> Sergey
>

Hi Sergey,

Let's put my patches aside for now; I'm using the stock tar supplied by Debian.
The problem only happens when the archive is uncompressed, probably
because of the way tar handles input from a subprocess.
This is what I'm doing:

~/src/tar$ tar -c lib -f lib.tar

~/src/tar$ dd bs=4k count=1 if=/dev/zero of=pad
1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000108667 s, 37.7 MB/s

~/src/tar$ cat pad lib.tar >lib-offset.tar

~/src/tar$ ll pad *.tar
-rw-r--r--. 1 teknoraver teknoraver  94K Oct 31 14:07 lib-offset.tar
-rw-r--r--. 1 teknoraver teknoraver  90K Oct 31 14:07 lib.tar
-rw-r--r--. 1 teknoraver teknoraver 4.0K Oct 31 14:07 pad

~/src/tar$ (dd status=none ibs=4k skip=1 count=0 && tar t) <lib-offset.tar
lib/
lib/.gitignore
lib/Makefile.am
lib/attr-xattr.in.h
lib/wordsplit.c
tar: rmtlseek not stopped at a record boundary
tar: Error is not recoverable: exiting now

I think the problem is that, at the end of the listing, tar assumes
the current position is a multiple of 10240 (the default record_size).
Here the archive starts 4096 bytes into the file, so the absolute
offset returned by lseek is not such a multiple.
The problem disappears if I save the starting position when the archive
is opened and subtract it in the later calculation, like this:

--- a/src/buffer.c
+++ b/src/buffer.c
@@ -41,6 +41,8 @@
 /* Number of retries before giving up on read.  */
 enum { READ_ERROR_MAX = 10 };

+static off_t starting_offset;
+
 /* Variables.  */

 static tarlong prev_written;    /* bytes written on previous volumes */
@@ -780,6 +782,8 @@ _open_archive (enum access_mode wanted_access)
             enum compress_type type;

             archive = STDIN_FILENO;
+           starting_offset = lseek (archive, 0, SEEK_CUR);
+
             type = check_compressed_archive (&shortfile);
             if (type != ct_tar && type != ct_none)
              paxfatal (0, _("Archive is compressed. Use %s option"),
@@ -1096,6 +1100,7 @@ seek_archive (off_t size)
   if (offset < 0)
     return offset;

+  offset -= starting_offset;
   if (offset % record_size)
     paxfatal (0, _("rmtlseek not stopped at a record boundary"));
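
To make the arithmetic behind the patch concrete, here is a minimal
standalone sketch of the boundary check. It is not tar code:
at_record_boundary is a hypothetical helper, and the values in the
note below are just the 4096-byte pad and the 10240-byte default
record size from the example above.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical helper, not part of tar.  Returns true when OFFSET, an
   absolute position as returned by lseek, lies on a record boundary
   counted from STARTING_OFFSET, the position the descriptor had when
   the archive was opened.  */
static bool
at_record_boundary (off_t offset, off_t starting_offset, off_t record_size)
{
  assert (record_size > 0);
  return (offset - starting_offset) % record_size == 0;
}
```

With a starting offset of 0, an absolute position of 4096 + 10240 is
not on a boundary, which matches the error above; with a starting
offset of 4096 the same position is accepted.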

Regards,
-- 
Matteo Croce

perl -e 'for($t=0;;$t++){print chr($t*($t>>8|$t>>13)&255)}' |aplay
