Hi Jeevan,

On Wed, Aug 28, 2019 at 10:26 PM Jeevan Chalke <
jeevan.cha...@enterprisedb.com> wrote:

>
>
> On Tue, Aug 27, 2019 at 10:33 PM Robert Haas <robertmh...@gmail.com>
> wrote:
>
>> While reviewing a proposed patch to basebackup.c this morning, I found
>> myself a bit underwhelmed by the quality of the code and comments in
>> basebackup.c's sendFile(). I believe it's already been pointed out
>> that the the retry logic here is not particularly correct, and the
>> comments demonstrate a pretty clear lack of understanding of how the
>> actual race conditions that are possible here might operate.
>>
>> However, I then noticed another problem which I think is significantly
>> more serious: this code totally ignores the possibility of a failure
>> in fread().  It just assumes that if fread() fails to return a
>> positive value, it's reached the end of the file, and if that happens,
>> it just pads out the rest of the file with zeroes.  So it looks to me
>> like if fread() encountered, say, an I/O error while trying to read a
>> data file, and if that error were on the first data block, then the
>> entire contents of that file would be replaced with zero bytes in the
>> backup, and no error or warning of any kind would be given to the
>> user.  If it hit the error later, everything from the point of the
>> error onward would just get replaced with zero bytes.  To be clear, I
>> think it is fine for basebackup.c to fill out the rest of the expected
>> length with zeroes if in fact the file has been truncated during the
>> backup; recovery should fix it.  But recovery is not going to fix data
>> that got eaten because EIO was encountered while copying a file.
>>
>
> Per fread(3) manpage, we cannot distinguish between an error or EOF. So to
> check for any error we must use ferror() after fread(). Attached patch
> which
> tests ferror() and throws an error if it returns true.
>

You are right. This seems the right approach to me. I can see at least
couple
of places already using ferror() to guard errors of fread()
like CopyGetData(),
read_backup_label() etc.


> However, I think
> fread()/ferror() both do not set errno (per some web reference) and thus
> throwing a generic error here. I have manually tested it.
>

If we are interested in getting the errno, then instead of fread(), we can
use read(). But, here, in particular, we are not decision making anything
depending on errno so I think we should be fine with your current patch.


>
>> The logic that rereads a block appears to have the opposite problem.
>> Here, it will detect an error, but it will also fail in the face of a
>> concurrent truncation, which it shouldn't.
>>
>
> For this, I have checked for feof() assuming that when the file gets
> truncated
> we reach EOF. And if yes, getting out of the loop instead of throwing an
> error.
> I may be wrong as I couldn't reproduce this issue.
>

I had a look at the patch and this seems correct to me.

Few minor comments:

+ /* Check fread() error. */
+ CHECK_FREAD_ERROR(fp, pathbuf);
+

The comments above the macro call at both the places are not necessary as
your macro name itself is self-explanatory.

----------
+ /*
+ * If file is truncated, then we will hit
+ * end-of-file error in which case we don't
+ * want to error out, instead just pad it with
+ * zeros.
+ */
+ if (feof(fp))

The if block does not do the truncation right away, so I think the comment
above can be reworded to explain why we reset cnt?

Regards,
Jeevan Ladhe

Reply via email to