First of all Thomas, your suggestions are *greatly appreciated! * Right now I am getting ready for eclipse-watching in the US and am out of town and/or vacationing. But when I get back I'll soon travel to Singapore to talk at BlackHat Asia 2014 and will spend a couple of weeks in Malaysia after that.
I haven't forgotten about the batch of changes that are already backlogged. When I get back late May I plan on starting in earnest to get everything back into master. However, if either Pete or Thomas want to address the problems you can do so in a branch or, as far as I am concerned, do it in the master branch as well. You guys rock! And thanks for your work on this. Basically it is *you *who are keeping this project alive. In thanks Rocky On Sat, Apr 6, 2024 at 11:38 AM Thomas Schmitt via Libcdio-devel < libcdio-devel@gnu.org> wrote: > Hi, > > Pete Batard wrote: > > strncpy(cpy_result, p_psz_out, i_inlen); > > Known as nitpicker i want to to point out that this would avoid a memory > corruption in case of overflow but would also truncate the name, > potentially to an incomplete UTF-8 byte sequence at the end. > > I add the technical part of my private mail to Rocky of yesterday with > assessment and proposal which is in part combinable with Pete's. > > ------------------------------------------------------------------ > My assessment for now: > > The strcpy() gesture in function _iso9660_recname_to_cstring() is part > of the conversion from UCS-2 to UTF-8. > I don't see the need for strncpy(), because the result of > cdio_charset_to_utf8() seems to be 0-terminated. (At least it does not > reply a length which would be needed if no terminating 0 is appended.) > > But the memory safety depends on the caller having allocated cpy_result > with a sufficient size. > And there i see a potential problem with some languages. > > The allocation of cpy_result happens in _iso9660_dir_to_statbuf(), > line 858 ff.: > > if (!dir_len) return NULL; > > i_fname = from_711(p_iso9660_dir->filename.len); > > /* .. string in statbuf is one longer than in p_iso9660_dir's listing > '\1' */ > stat_len = sizeof(iso9660_stat_t) + i_fname + 2; > > /* Reuse multiextent p_stat if not NULL */ > if (!p_stat) { > p_stat = calloc(1, stat_len); > first_extent = true; > > The size of p_stat->filename is then (i_fname + 2), i.e. what is needed > for the UCS-2 representation of the name plus trailing 16-bit 0. > p_stat->filename becomes parameter cpy_result: > > Line 956: > if (!_iso9660_recname_to_cstring(&p_iso9660_dir->filename.str[1], > i_inlen, NULL, p_stat->filename, > u_joliet_level)) > Line 965: > if (!_iso9660_recname_to_cstring(&p_iso9660_dir->filename.str[1], > i_inlen, NULL, p_stat->filename, > u_joliet_level)) > > Normally text shrinks during the conversion from UCS-2 to UTF-8. But > there are UCS-2/UTF-16 characters which need more than 2 bytes. E.g.: > https://www.compart.com/en/unicode/U+0800 > Samaritan Letter Alaf > UTF-8 Encoding: 0xE0 0xA0 0x80 > UTF-16 Encoding: 0x0800 > > So if the UCS-2 name contains more than one of these UTF-8 3-byters the > surplus space for the 16-bit 0 can be used up and the cpy_result can > overflow, regardless of strcpy() versus strncpy(). > (Each 1-byte UTF-8 character in the conversion result adds another byte > of room for 3-byters. But i doubt that a string with samaritan letters > will contain many letters from 7-bit US-ASCII.) > > It looks like UTF-8 4-byters are not possible in UCS-2. > If we encounter UTF-16 instead, the Joliet characters which yield 4-byte > UTF-8 bytes are 4-byters themselves. (Conversion might yield poor results > if really UCS-2 is expected. But memory should be safe.) > > -------------------------------------------------------------------------- > Proposal in addition to Pete's: > > In case of HAVE_JOLIET and u_joliet_level allocate in > _iso9660_dir_to_statbuf() > 3/2 of the UCS-2 name length (i_fname of a good UCS-2 string is divisibe > by 2): > > stat_len = sizeof(iso9660_stat_t) + 3 * i_fname / 2 + 2; > > Rather an alternative to Pete's proposal: > > Further i propose to add a parameter > size_t cpy_result_size > to > _iso9660_recname_to_cstring() > and to feed it with (3 * i_fname / 2 + 2) from _iso9660_dir_to_statbuf(). > In case of overflow, a message about programming error should be emitted. > > > Have a nice day :) > > Thomas > > >