------- Original Message -------
On Friday, May 5th, 2023 at 3:23 PM, Andrew Dunstan <and...@dunslane.net> wrote:

> On 2023-05-05 Fr 06:02, gkokola...@pm.me wrote:
>
>> ------- Original Message -------
>> On Friday, May 5th, 2023 at 8:00 AM, Alexander Lakhin
>> [<exclus...@gmail.com>](mailto:exclus...@gmail.com)
>> wrote:
>>
>>> 23.03.2023 20:10, Tomas Vondra wrote:
>>>
>>>> So pushed all three parts, after updating the commit messages a bit.
>>>>
>>>> This leaves the empty-data issue (which we have a fix for) and the
>>>> switch to LZ4F. And then the zstd part.
>>>
>>> I'm sorry that I haven't noticed/checked that before, but when trying to
>>> perform check-world with Valgrind I've discovered another issue presumably
>>> related to LZ4File_gets().
>>> When running under Valgrind:
>>> PROVE_TESTS=t/002_pg_dump.pl make check -C src/bin/pg_dump/
>>> I get:
>>> ...
>>> 07:07:11.683 ok 1939 - compression_lz4_dir: glob check for
>>> .../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir/*.dat.lz4
>>> # Running: pg_restore --jobs=2 
>>> --file=.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir.sql
>>> .../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir
>>>
>>> ==00:00:00:00.579 2811926== Conditional jump or move depends on 
>>> uninitialised value(s)
>>> ==00:00:00:00.579 2811926== at 0x4853376: rawmemchr 
>>> (vg_replace_strmem.c:1548)
>>> ==00:00:00:00.579 2811926== by 0x4C96A67: _IO_str_init_static_internal 
>>> (strops.c:41)
>>> ==00:00:00:00.579 2811926== by 0x4C693A2: _IO_strfile_read (strfile.h:95)
>>> ==00:00:00:00.579 2811926== by 0x4C693A2: __isoc99_sscanf 
>>> (isoc99_sscanf.c:28)
>>> ==00:00:00:00.579 2811926== by 0x11DB6F: _LoadLOs 
>>> (pg_backup_directory.c:458)
>>> ==00:00:00:00.579 2811926== by 0x11DD1E: _PrintTocData 
>>> (pg_backup_directory.c:422)
>>> ==00:00:00:00.579 2811926== by 0x118484: restore_toc_entry 
>>> (pg_backup_archiver.c:882)
>>> ==00:00:00:00.579 2811926== by 0x1190CC: RestoreArchive 
>>> (pg_backup_archiver.c:699)
>>> ==00:00:00:00.579 2811926== by 0x10F25D: main (pg_restore.c:414)
>>> ==00:00:00:00.579 2811926==
>>> ...
>>>
>>> It looks like the line variable returned by gets_func() here is not
>>> null-terminated:
>>> while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
>>>
>>> {
>>> ...
>>> if (sscanf(line, "%u %" CppAsString2(MAXPGPATH) "s\n", &oid, lofname) != 2)
>>> ...
>>> And Valgrind doesn't like it.
>>
>> Valgrind is correct to not like it. LZ4Stream_gets() got modeled after
>> gets() when it should have been modeled after fgets().
>>
>> Please find a patch attached to address it.
>
> Isn't using memset here a bit wasteful? Why not just put a null at the end 
> after calling LZ4Stream_read_internal(), which tells you how many bytes it 
> has written?

Good point. I thought about it before submitting the patch. I concluded that 
given the complexity and operations involved in LZ4Stream_read_internal() and 
the rest of the pg_dump/pg_restore code, the memset() call will be negligible. 
However from the readability point of view, the function is a bit cleaner with 
the memset().

I will not object to any suggestion though, as this is a very trivial point. 
Please find attached a v2 of the patch following the suggested approach.

Cheers,

//Georgios

> cheers
>
> andrew
>
> --
> Andrew Dunstan
> EDB:
> https://www.enterprisedb.com
From 65dbce1eb81597e3dd44eff62d8d667b0a3322da Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokola...@pm.me>
Date: Fri, 5 May 2023 09:47:02 +0000
Subject: [PATCH v2] Null terminate the output buffer of LZ4Stream_gets

LZ4Stream_gets did not null terminate its output buffer. Its callers expected
the buffer to be null terminated so they passed it around to functions such as
sscanf with unintended consequences.

Reported-by: Alexander Lakhin<exclus...@gmail.com>
---
 src/bin/pg_dump/compress_lz4.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 423e1b7976..0f447919b2 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -459,6 +459,10 @@ LZ4Stream_read_internal(LZ4State *state, void *ptr, int ptrsize, bool eol_flag)
 	if (!LZ4Stream_init(state, size, false /* decompressing */ ))
 		return -1;
 
+	/* No work needs to be done for a zero-sized output buffer */
+	if (size <= 0)
+		return 0;
+
 	/* Verify that there is enough space in the outbuf */
 	if (size > state->buflen)
 	{
@@ -636,7 +640,9 @@ LZ4Stream_gets(char *ptr, int size, CompressFileHandle *CFH)
 	LZ4State   *state = (LZ4State *) CFH->private_data;
 	int			ret;
 
-	ret = LZ4Stream_read_internal(state, ptr, size, true);
+	Assert(size > 1);
+
+	ret = LZ4Stream_read_internal(state, ptr, size - 1, true);
 	if (ret < 0 || (ret == 0 && !LZ4Stream_eof(CFH)))
 		pg_fatal("could not read from input file: %s", LZ4Stream_get_error(CFH));
 
@@ -644,6 +650,12 @@ LZ4Stream_gets(char *ptr, int size, CompressFileHandle *CFH)
 	if (ret == 0)
 		return NULL;
 
+	/*
+	 * Our caller expects the return string to be NULL terminated
+	 * and we know that ret is greater than zero.
+	 */
+	ptr[ret - 1] = '\0';
+
 	return ptr;
 }
 
-- 
2.34.1

Reply via email to