Hello, any updates here? Is this a real issue or a misunderstanding?
Thank you. Regards, Ondrej On Tue, Mar 30, 2021 at 4:09 PM Ondrej Dubaj <odu...@redhat.com> wrote: > gentle ping > > On Fri, Mar 19, 2021 at 12:22 PM Ondrej Dubaj <odu...@redhat.com> wrote: > >> Hello, >> any update here? >> >> Thanks. >> Ondrej >> >> On Mon, Mar 1, 2021 at 11:05 AM Ondrej Dubaj <odu...@redhat.com> wrote: >> >>> Ping, any updates here? >>> >>> Thanks. >>> >>> On Mon, Feb 15, 2021 at 5:07 PM Ondrej Dubaj <odu...@redhat.com> wrote: >>> >>>> Gentle ping >>>> >>>> On Mon, Jan 18, 2021 at 12:02 PM Ondrej Dubaj <odu...@redhat.com> >>>> wrote: >>>> >>>>> One of the customer faced I/O errors while archiving a huge file 11 TB >>>>> and observed that after Tar had hit read I/O error due to xfs filesystem, >>>>> it still continue writing 0's to the file using strace. However there was >>>>> no indication for tar that it was writing 0's when the error occurred. >>>>> >>>>> Later it was found that it is expected behavior to write 0's as the file >>>>> header is already written. Hence, it need to be padded with 0's. >>>>> >>>>> Using the reproducing steps provided by customer, we can see this >>>>> behavior. >>>>> >>>>> Padding 0's is expected behavior however it does so silently (for Read >>>>> error at byte...), it should say it is Padding with zeros similar to how >>>>> it reports "File Shrank , padding with zeroes" >>>>> >>>>> During the reproducer steps provided by customer we see that sometimes >>>>> tar report "Read I/O errors" as "File shrank, padding with 0" , we see in >>>>> the step(2) provided. >>>>> >>>>> Reproducer available here: >>>>> >>>>> #!/bin/bash >>>>> # Reproducer "tardust" >>>>> # >>>>> # When "tar create" reads a file there are several shortcomings when it >>>>> hits read error >>>>> # >>>>> # 1) When read() returns 0 bytes due to read error, then this happens >>>>> # read(4, >>>>> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >>>>> 3584) = 3584 >>>>> # write(3, >>>>> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >>>>> 3584) = 3584 >>>>> # read(4, 0x563adef7b000, 3584) = -1 EIO (Input/output error) >>>>> # write(2, "tar: ", 5tar: ) = 5 >>>>> # write(2, "/mntx/testfile: Read error at by"..., 70/mntx/testfile: Read >>>>> error at byte 260653056, while reading 3584 bytes) = 70 >>>>> # write(2, ": Input/output error", 20: Input/output error) = 20 >>>>> # write(3, >>>>> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >>>>> 3584) = 3584 >>>>> # write(3, >>>>> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >>>>> 3584) = 3584 >>>>> # write(3, >>>>> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >>>>> 3584) = 3584 >>>>> # Actual behaviour: it prints a message about "Read error", but it >>>>> conceals the information it will pad the output with zeros >>>>> # Expected behaviour: it should also print the information "padding with >>>>> zero" >>>>> # 2) There is a 2nd shortcoming about tar not differentiate between "read >>>>> error" and "file shrinkage" >>>>> # That means when it sees a short read due to read error, it does not >>>>> report read error. >>>>> # It looks like this: >>>>> # read(4, >>>>> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >>>>> 3584) = 3584 >>>>> # write(3, >>>>> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >>>>> 3584) = 3584 >>>>> # read(4, >>>>> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >>>>> 3584) = 2560 <<< HERE >>>>> # write(2, "tar: ", 5tar: ) = 5 >>>>> # write(2, "/mntx/testfile: File shrank by 5"..., 65/mntx/testfile: File >>>>> shrank by 53927936 bytes; padding with zeros) = 65 >>>>> # write(2, "\n", 1 >>>>> # ) = 1 >>>>> # write(3, >>>>> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >>>>> 3584) = 3584 >>>>> # write(3, >>>>> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >>>>> 3584) = 3584 >>>>> # Summary: A read error is not reported here. At least it now says >>>>> "padding with zeros" >>>>> # Expected behaviour: it should report a read error, so the user knows >>>>> what it going on. >>>>> # >>>>> # 3) Side-Note: >>>>> # The blocking factor is applied to the output. When reading a file, all >>>>> reads are misaligned by 512 bytes. >>>>> # This is because it writes a 512 header for every archived file. >>>>> # That means the first read from file is 512bytes too short: >>>>> # Running with tar-blocking-factor=7 >>>>> # fstat(1, {st_mode=S_IFREG|0644, st_size=17827, ...}) = 0 >>>>> # write(1, "/mntx/testfile\n", 15/mntx/testfile >>>>> # ) = 15 >>>>> # read(4, >>>>> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >>>>> 3072) = 3072 #1st read 512bytes too short >>>>> # write(3, "mntx/testfile\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >>>>> 3584) = 3584 >>>>> # read(4, >>>>> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >>>>> 3584) = 3584 >>>>> # write(3, >>>>> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >>>>> 3584) = 3584 >>>>> # >>>>> # 4) Reproducer overview: >>>>> # - Create a 500MB testimage, then create a testfile1 in the image >>>>> # - Use losetup/dmsetup with the "dust" target type >>>>> # - you can inject IO errors at specified block number in "dust" >>>>> # - You must hit a 4K boundary to see EIO, so use tar-blocking-factor=7 >>>>> and >>>>> # - vary the bad blocknumber to find the case (1) >>>>> echo Step 1 Create disk image >>>>> dd if=/dev/zero of=/tmp/testimage bs=1M count=500 || exit >>>>> echo Step 2 Create XFS in image >>>>> mkfs.xfs /tmp/testimage || exit >>>>> echo Step 3 Use losetup so the file can be used a block device >>>>> losetup /dev/loop1 /tmp/testimage || exit >>>>> losetup >>>>> echo Step 6 Now create the testfile, this will have read error injected >>>>> later >>>>> mkdir /mntx >>>>> mount /dev/loop1 /mntx || exit >>>>> dd if=/dev/zero of=/mntx/testfile bs=1M count=300 || exit >>>>> umount /mntx >>>>> echo Step7 Now iterating through bad blocks >>>>> echo As result, there are strace output file a1000 ... a1040 >>>>> for i in `seq 1000 1 1040` >>>>> do >>>>> echo >>>>> echo Badblock $i >>>>> let ERR=i >>>>> let ERR1=i+1 >>>>> let NUMSECTOR2=1024000-ERR1 >>>>> #echo ERR1 is $ERR1 >>>>> #echo NUMSECTOR2 is $NUMSECTOR2 >>>>> dmsetup create tardust <<EOF >>>>> 0 $ERR linear /dev/loop1 0 >>>>> $ERR 1 error >>>>> $ERR1 $NUMSECTOR2 linear /dev/loop1 $ERR1 >>>>> EOF >>>>> #dmsetup ls >>>>> #dmsetup status >>>>> #dmsetup table >>>>> mount /dev/mapper/tardust /mntx || exit >>>>> strace tar cvbf 7 /tmp/tardust.tar /mntx/testfile >&/tmp/a$i >>>>> umount /mntx >>>>> dmsetup remove tardust >>>>> grep -e error -e shrank /tmp/a$i >>>>> done >>>>> echo "Done: inspect the strace output file for error behaviour (grep >>>>> error ; Look at last read()-call )" >>>>> losetup -d /dev/loop1 >>>>> >>>>> ================= >>>>> >>>>> Actual results: >>>>> - When tar hits a disk read error when reading file from disk and >>>>> creating an archive, then it prints "file shrank" >>>>> - then it writes zeros (aka padding) according to initial file size (but >>>>> does not print that message) >>>>> - This happens in most cases (due to tar-block-size / disk-block-size / >>>>> read-shift-by-512-bytes interaction) >>>>> - I provided a reproducer which shows under which circumstances it >>>>> correctly prints "Read error at byte…" >>>>> >>>>> Expected results: >>>>> - When there is a read error, THEN tar shall report a read error >>>>> - When there is a read error, THEN tar shall NOT report a "file shrank" >>>>> - In addition it SHALL print "Padding with zeros". This is missing >>>>> currently. >>>>> >>>>> >>>>> Regards, >>>>> >>>>> Ondrej Dubaj >>>>> >>>>>