Comments in line. On Fri, Jun 24, 2022 at 8:40 AM Thomas Schmitt <scdbac...@gmx.net> wrote:
> Hi, > > Ben Kohler wrote: > > I've found that for an iso > > created with (cdrtools) mkisofs -J -iso-level 3, libcdio tools like > > iso-info and iso-read are not able to handle multi-extent files. > > It's a pair of bugs in function _iso9660_dir_to_statbuf() of libcdio > source file lib/iso9660/iso9660_fs.c . > > ----------------------------------------------------------------------- > Symptoms: > > I can reproduce this with a xorriso-made ISO too, which has Rock Ridge > additionally to Joliet. > (mkisofs with -R doesn't help either. genisoimage is safe because it > does not obey -iso-level 3 and aborts already on image production.) > > A workaround for iso-info seems to be option > --no-joliet > man iso-read does not mention this option. > > A run of iso-info without --no-joliet complains: > > ++ WARN: Non consecutive multiextent file parts for '' > > ----------------------------------------------------------------------- > Diagnosis: > > The message comes from _iso9660_dir_to_statbuf() in iso9660_fs.c. > It is supposed to report a filename in the quotation marks. > In order to get there, a re-used p_stat has to be submitted by the > caller. > > The first bug is this: > > When iso9660_ifs_readdir() calls _iso9660_dir_to_statbuf() with the first > directory entry of the multi-extent file it gets back an empty string as > file name, because the conversion from UCS-2 to UTF-8 is missing if the > ISO_MULTIEXTENT bit is set in the directory record. > This is probably my fault due to the assumption that libcdio already > deals with C strings at that point of processing. > See > > https://git.savannah.gnu.org/cgit/libcdio.git/commit/?id=d758fa2253cca062e92ad0754a64c15a854db4ff > "Use the plain ISO-9660 name when dealing with a multiextent file part" > > The second bug is similar: > > The test which leads to the warning message compares the C string which > holds > the previously read file name with the UCS-2 array which holds the name of > the follow-up directory record. > So even if the UTF-8 conversion missing in above bug is done, this test > still > detects that the string length does not match the array byte size and thus > throws the warning and prevents the proper merging of both directory > records. > See > > https://git.savannah.gnu.org/cgit/libcdio.git/commit/?id=d758fa2253cca062e92ad0754a64c15a854db4ff > "Only resolve the full filename when we're not dealing with extent" > > ----------------------------------------------------------------------- > Code to test: > > I now have an ugly contraption by which iso-info recognizes a multi-extent > file even if Joliet is present. The base is an outdated repo clone. > I will have to get a fresh git state and try to find my cheat sheet > which describes how to submit changes to libcdio. > > The current diff (minus all my debugging printfs) is: > > diff --git a/lib/iso9660/iso9660_fs.c b/lib/iso9660/iso9660_fs.c > index be693c7..a84b9fc 100644 > --- a/lib/iso9660/iso9660_fs.c > +++ b/lib/iso9660/iso9660_fs.c > @@ -866,8 +866,25 @@ _iso9660_dir_to_statbuf (iso9660_dir_t *p_iso9660_dir, > if ((p_iso9660_dir->file_flags & ISO_MULTIEXTENT) == 0) { > /* Check if this is the last part of a multiextent file */ > if (!first_extent) { > - if (strlen(p_stat->filename) != i_fname || > - strncmp(p_stat->filename, &p_iso9660_dir->filename.str[1], > i_fname) != 0) { > + cdio_utf8_t *p_psz_out = NULL; > + int bad_multi; > + > +#ifdef HAVE_JOLIET > + if (u_joliet_level) { > + int i_inlen = i_fname; > + if (!cdio_charset_to_utf8(&p_iso9660_dir->filename.str[1], i_inlen, > + &p_psz_out, "UCS-2BE")) { > + goto fail; > + } > + } else > +#endif /*HAVE_JOLIET*/ > + { > + p_psz_out = calloc(1, i_fname + 1); > + strncpy (p_psz_out, &p_iso9660_dir->filename.str[1], i_fname); > + } > + bad_multi = (strcmp(p_stat->filename, p_psz_out) != 0); > + free(p_psz_out); > + if (bad_multi) { > cdio_warn("Non consecutive multiextent file parts for '%s'", > p_stat->filename); > goto fail; > @@ -916,7 +933,22 @@ _iso9660_dir_to_statbuf (iso9660_dir_t *p_iso9660_dir, > } > } else { > /* Use the plain ISO-9660 name when dealing with a multiextent file > part */ > - strncpy(p_stat->filename, &p_iso9660_dir->filename.str[1], i_fname); > +#ifdef HAVE_JOLIET > + if (u_joliet_level) { > + int i_inlen = i_fname; > + cdio_utf8_t *p_psz_out = NULL; > + if (cdio_charset_to_utf8(&p_iso9660_dir->filename.str[1], i_inlen, > + &p_psz_out, "UCS-2BE")) { > + strncpy(p_stat->filename, p_psz_out, i_fname); > + free(p_psz_out); > + } else { > + goto fail; > + } > + } else > +#endif /*HAVE_JOLIET*/ > + { > + strncpy (p_stat->filename, &p_iso9660_dir->filename.str[1], > i_fname); > + } > } > > iso9660_get_dtime(&(p_iso9660_dir->recording_time), true, > &(p_stat->tm)); > > -------------------------------------------------------------------------- > > @ Ben Kohler: > You'd do me a big favor if you could test this with your use cases. > > @ All: > I am unhappy with the code repetition in above patch (plus the original > conversion code which is already _iso9660_dir_to_statbuf()). > Abstraction proposals to unify the now three cdio_charset_to_utf8() > occasions in _iso9660_dir_to_statbuf() would be welcome. > > Thomas: thanks again for the great work and followup. My suggestion is to get your patches in now. Then in a week Pete can have something that matches what is in git master. I will try to come up with an abstraction proposal or something to remove redundancy when I get a chance, but that won't be in the next two weeks as I'll be busy and out of town. Thanks. > > Have a nice day :) > Same to you :o) > > Thomas > > >