Hi Matthias,

On Thu, Jan 9, 2020, 20:21 Matthias Apitz <g...@unixarea.de> wrote:
> Hello,
>
> We encounter the following problem with ESQL/C: Imagine a table with two
> columns: CHAR(16) and DATE
>
> The CHAR column can contain not only 16 bytes, but 16 Unicode chars,
> which are longer than 16 bytes if one or more of the chars is UTF-8
> multibyte encoded.
>
> If one provides in C a host structure to FETCH the data as:
>
>     EXEC SQL BEGIN DECLARE SECTION;
>     struct r_d02ben_ec {
>         char string[17];
>         char date[11];
>     };
>     typedef struct r_d02ben_ec t_d02ben_ec;
>     t_d02ben_ec *hp_d02ben, hrec_d02ben;
>     EXEC SQL END DECLARE SECTION;
>
> and fetches the data with ESQL/C as:
>
>     EXEC SQL FETCH hc_d02ben INTO :hrec_d02ben;
>
> The generated C-code looks like this:
>
>     ...
>     ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "fetch hc_d02ben",
>     ECPGt_EOIT,
>     ECPGt_char,&(hrec_d02ben.string),(long)17,(long)1,sizeof( struct r_d02ben_ec ),
>     ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L,
>     ECPGt_char,&(hrec_d02ben.date),(long)11,(long)1,sizeof( struct r_d02ben_ec ),
>     ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L,
>     ...
>
> As you can see for the first item the length 17 is sent to the PG server
> together with the pointer to where the data should be stored
> and for the second element the length 11 is sent (which is big enough to
> receive in ASCII MM.DD.YYYY and a trailing \0).
>
> What we now see using GDB is that for the first element all UTF-8 data
> is returned, let's assume only one multibyte char, which gives 17 bytes,
> not only 16, and the trailing NULL is already placed into the element for
> the date. Now the function ECPGdo() returns the date as MM.DD.YYYY
> into the area pointed to for the 2nd element and with this overwrites
> the NULL terminator of the string[17] element. Result is later a
> SIGSEGV because the expected string in string[17] is not NULL
> terminated anymore :-)
>
> I would call it a bug, that ECPGdo() puts more than 17 bytes (16 bytes +
> NULL) as return into the place pointed to by the host var pointer when
> the column in the database has more (UTF-8) chars than will fit into
> 16+1 byte.
>
> Comments?
> Proposals for a solution?
>
> Thanks
>
> matthias
>
> --
> Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/
> +49-176-38902045
> Public GnuPG key: http://www.unixarea.de/key.pub

I would be cautious about calling this a bug, as it is a classic buffer
overflow (i.e. design) issue: if the column holds multibyte UTF-8
characters, the text is no longer 16 bytes long, and you should plan extra
space in your host variables.
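
To illustrate what "planning extra space" could look like, here is a minimal sketch in plain C (no ECPG preprocessing needed to compile it). It assumes the database encoding is UTF-8, where one character takes at most 4 bytes. The names r_d02ben_wide, UTF8_MAX_BYTES_PER_CHAR, utf8_char_count and bytes_needed are made up for this example and are not part of ECPG or of your code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: UTF-8 encoding, at most 4 bytes per character. */
#define UTF8_MAX_BYTES_PER_CHAR 4
#define D02BEN_STRING_CHARS 16 /* the CHAR(16) column */
#define D02BEN_STRING_BYTES \
    (D02BEN_STRING_CHARS * UTF8_MAX_BYTES_PER_CHAR + 1) /* 65, incl. '\0' */

/* Hypothetical widened host structure: sized in bytes, not characters. */
struct r_d02ben_wide {
    char string[D02BEN_STRING_BYTES];
    char date[11]; /* MM.DD.YYYY plus '\0', pure ASCII */
};

/* Count UTF-8 characters by skipping continuation bytes (10xxxxxx). */
size_t utf8_char_count(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Bytes a buffer must provide to hold s, including the terminating '\0'. */
size_t bytes_needed(const char *s)
{
    return strlen(s) + 1;
}
```

For example, the value "Müller_12345678x" is 16 characters, but the "ü" takes two bytes in UTF-8, so bytes_needed() reports 18 for it: one more than char string[17] provides, yet well within the 65 bytes of the widened field.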