Hi Spyridon,

> The column "single_byte_col" is supposed to store only 1 byte. Nevertheless,
> the INSERT command implicitly casts the '🀆' text into "char". This means that
> only the first byte of '🀆' ends up stored in the column.
> gdb reports that "pg_mblen(p) = 4" (line 1046), which is expected since the
> pg_mblen('🀆') is indeed 4. Later at line 1050, the memcpy will copy 4 bytes
> instead of 1, hence an out of bounds memory read happens for pointer 's',
> which effectively copies random bytes.

Many thanks for reporting this!

> - OS: Ubuntu 20.04
> - PSQL version 14.4

I can confirm the bug exists in the `master` branch as well and doesn't
depend on the platform.

Although the bug is easy to fix for this particular case (see the patch),
I'm not sure if this solution is general enough. E.g. is there something
that generally prevents pg_mblen() from doing out-of-bounds reads in
cases similar to this one? Should we prevent such an INSERT from happening
instead?

--
Best regards,
Aleksander Alekseev
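
P.S. To make the question about pg_mblen() more concrete, here is a minimal standalone sketch of the failure mode and of one possible guard. It is not the attached patch and not PostgreSQL code: all `sketch_*` names are hypothetical, and `sketch_utf8_mblen()` is only a stand-in that, like a lead-byte based length function, looks at the first byte alone. For a value holding just the first byte of '🀆' (0xF0) it reports 4, while the bounded variant clamps the result to the bytes actually available.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a UTF-8 character-length function: the length
 * is derived from the lead byte only, without checking how many bytes are
 * really present behind it.
 */
static int
sketch_utf8_mblen(const unsigned char *s)
{
    if ((*s & 0x80) == 0x00)
        return 1;
    if ((*s & 0xE0) == 0xC0)
        return 2;
    if ((*s & 0xF0) == 0xE0)
        return 3;
    if ((*s & 0xF8) == 0xF0)
        return 4;
    return 1;                   /* invalid lead byte: treat as one byte */
}

/*
 * Hypothetical guard: never report more bytes than remain before 'end'.
 * Returns 0 if the buffer is empty.
 */
static int
sketch_mblen_bounded(const unsigned char *s, const unsigned char *end)
{
    ptrdiff_t   avail;
    int         len;

    assert(end >= s);
    avail = end - s;
    if (avail == 0)
        return 0;

    len = sketch_utf8_mblen(s);
    return (len > avail) ? (int) avail : len;
}

/*
 * Copy the first character of 'src' (which holds 'srclen' bytes) into 'dst'
 * and return the number of bytes copied. The copy length is clamped, so
 * memcpy() cannot read past the end of 'src'.
 */
static size_t
sketch_copy_first_char(unsigned char *dst, const unsigned char *src,
                       size_t srclen)
{
    size_t      n = (size_t) sketch_mblen_bounded(src, src + srclen);

    memcpy(dst, src, n);
    return n;
}
```

With a one-byte source containing 0xF0, `sketch_utf8_mblen()` returns 4 but `sketch_copy_first_char()` copies only 1 byte. Whether clamping like this is the right behaviour, as opposed to rejecting the truncated value at INSERT time, is exactly the open question above.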
v1-0001-Fix-out-of-bounds-memory-reads-in-text_substring.patch