On Fri, Feb 14, 2025 at 5:30 AM Pavel Koshevoy <pkoshe...@gmail.com> wrote:
>
>
> On Thu, Feb 13, 2025, 22:04 Andreas Rheinhardt <andreas.rheinha...@outlook.com> wrote:
>
>> Pavel Koshevoy:
>> > The problem is reproducible with "Test for Quicktime 608 CC file.mov"
>> > from https://samples.ffmpeg.org/MPEG2/subcc/
>> >
>> > ffmpeg -i "Test for Quicktime 608 CC file.mov" -map 0 -c copy -y remuxed.mov
>> >
>> > Prior to the fix QuickTime Player playback of remuxed.mov would
>> > render garbage text for "English CC" subtitles.
>>
>> Is remuxing necessary for there being garbage?
>>
>
> The original file displays correct English CC text in QuickTime Player,
> and the remuxed file (prior to the fix) does not.
>
>
>
>> > ---
>> >  libavformat/mov.c | 70 +++++++++++++++++++++++++++++++++++++++--------
>> >  1 file changed, 59 insertions(+), 11 deletions(-)
>> >
>> > diff --git a/libavformat/mov.c b/libavformat/mov.c
>> > index 85aef33b19..5a91ef5b8c 100644
>> > --- a/libavformat/mov.c
>> > +++ b/libavformat/mov.c
>> > @@ -10788,25 +10788,73 @@ static int mov_change_extradata(AVStream *st, AVPacket *pkt)
>> >      return 0;
>> >  }
>> >
>> > -static int get_eia608_packet(AVIOContext *pb, AVPacket *pkt, int size)
>> > +static int get_eia608_packet(AVIOContext *pb, AVPacket *pkt, int src_size)
>> >  {
>> > -    int new_size, ret;
>> > +    /* We can't make assumptions about the structure of the payload,
>> > +       because it may include multiple cdat and cdt2 samples. */
>> > +    const uint32_t cdat = AV_RB32("cdat");
>> > +    const uint32_t cdt2 = AV_RB32("cdt2");
>>
>> I don't think that using (non-variable) variables for these improves
>> clarity (e.g. it means that the definition of the actual values used for
>> the comparisons below is now further away from its use). Why not simply
>> use MKBETAG('c','d','a','t') below?
>>
>
>
> That is a matter of personal preference. I personally find "cdat" more
> readable (and searchable) than any MKBETAG.
>
>
>
>> > +    int ret, out_size = 0;
>> >
>> > -    if (size <= 8)
>> > +    /* a valid payload must have size, 4cc, and at least 1 byte pair: */
>> > +    if (src_size < 10)
>> >          return AVERROR_INVALIDDATA;
>> > -    new_size = ((size - 8) / 2) * 3;
>> > -    ret = av_new_packet(pkt, new_size);
>> > +
>> > +    /* avoid an int overflow: */
>> > +    if ((src_size - 8) / 2 >= INT_MAX / 3)
>> > +        return AVERROR_INVALIDDATA;
>> > +
>> > +    ret = av_new_packet(pkt, ((src_size - 8) / 2) * 3);
>> >      if (ret < 0)
>> >          return ret;
>> >
>> > -    avio_skip(pb, 8);
>> > -    for (int j = 0; j < new_size; j += 3) {
>> > -        pkt->data[j] = 0xFC;
>> > -        pkt->data[j+1] = avio_r8(pb);
>> > -        pkt->data[j+2] = avio_r8(pb);
>> > +    /* parse and re-format the c608 payload in one pass. */
>> > +    while (src_size >= 10) {
>> > +        const uint32_t atom_size = avio_rb32(pb);
>> > +        const uint32_t atom_type = avio_rb32(pb);
>> > +        const uint32_t data_size = atom_size - 8;
>>
>> This may wrap around (if atom_size is < 8). If int is 32 bits, then the
>> data_size > src_size check will catch this, but in case of 64 bit ints
>> it may not. Relying on (unsigned, defined) integer wraparound should be
>> avoided unless it is advantageous to use it; in this case, this is just
>> not true: Just compare atom_size to 10 below.
>>
>
> I fully expect the size of uint32_t to be 32 bits, on any platform. It
> should be a compile time assertion, but that is outside the scope of this
> fix. The name of the data type says it's 32 bit long, so it must be so.
>
>
>
>> > +        const uint8_t cc_field =
>> > +            atom_type == cdat ? 1 :
>> > +            atom_type == cdt2 ? 2 :
>> > +            0;
>> > +
>> > +        /* account for bytes consumed for atom size and type. */
>> > +        src_size -= 8;
>> > +
>> > +        /* make sure the data size stays within the buffer boundaries. */
>> > +        if (data_size < 2 || data_size > src_size) {
>> > +            ret = AVERROR_INVALIDDATA;
>> > +            break;
>> > +        }
>> > +
>> > +        /* make sure the data size is consistent with N byte pairs. */
>> > +        if (data_size % 2 != 0) {
>>
>> We typically try to avoid redundant "!= 0".
>>
>
> Again, this is a matter of personal preference. If you would prefer to
> tweak the patch to suit your personal preference before merging -- you are
> free to do so, but I don't think it's a valid reason to delay a fix for a
> parser that has been mis-parsing well-formed files for the past 5 years.
>
>
>
>> > +            ret = AVERROR_INVALIDDATA;
>> > +            break;
>> > +        }
>> > +
>> > +        if (!cc_field) {
>> > +            /* neither cdat or cdt2 ... skip it */
>> > +            avio_skip(pb, data_size);
>> > +            src_size -= data_size;
>> > +            continue;
>> > +        }
>> > +
>> > +        for (int32_t i = 0; i < data_size; i += 2) {
>>
>> int32_t? Why signed? (And why use a separate loop counter at all? Simply
>> decrement data_size by 2 in each iteration.
>>
>
> Please feel free to make additional improvements to whatever fix you
> decide to merge.
>
>
>
>> > +            pkt->data[out_size] = (0x1F << 3) | (1 << 2) | (cc_field - 1);
>> > +            pkt->data[out_size + 1] = avio_r8(pb);
>> > +            pkt->data[out_size + 2] = avio_r8(pb);
>> > +            out_size += 3;
>> > +            src_size -= 2;
>> > +        }
>> >      }
>> >
>> > -    return 0;
>> > +    if (src_size > 0)
>> > +        /* skip any remaining unread portion of the input payload */
>> > +        avio_skip(pb, src_size);
>> > +
>> > +    av_shrink_packet(pkt, out_size);
>> > +    return ret;
>> >  }
>> >
>> >  static int mov_finalize_packet(AVFormatContext *s, AVStream *st, AVIndexEntry *sample,
>>
>> Generally, I believe that reading the input into pkt->data[size / 2]
>> would be advantageous: It would make it simple to check for EOF and I/O
>> errors (notice that the avio_r* reads above are unchecked) and would
>> read the data in one go, avoiding all the avio_skip().
>>
>> - Andreas
>>
>>
>>

I've created a bug report for this issue, with screenshots demonstrating
the problem and the fix:
https://trac.ffmpeg.org/ticket/11470

Pavel.
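
For illustration only: the review comment about checking atom_size
directly, instead of relying on unsigned wraparound of atom_size - 8,
could look roughly like the standalone sketch below. It works on an
in-memory buffer and uses no FFmpeg APIs. The names rb32 and
c608_to_cc_data, the out_size parameter, and the -1 error return are
hypothetical and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a big-endian 32-bit value from a byte buffer. */
static uint32_t rb32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}

/*
 * Hypothetical helper (not part of FFmpeg): walk the atoms of a c608
 * sample in src[0..src_size) and write one 3-byte triplet per byte pair
 * found in cdat/cdt2 atoms. Other atom types are skipped.
 * Returns 0 on success and stores the byte count in *out_size;
 * returns -1 on malformed input or if dst is too small.
 */
static int c608_to_cc_data(const uint8_t *src, size_t src_size,
                           uint8_t *dst, size_t dst_size,
                           size_t *out_size)
{
    size_t out = 0;

    /* size (4) + 4cc (4) + at least one byte pair (2) */
    if (src_size < 10)
        return -1;

    while (src_size >= 10) {
        const uint32_t atom_size = rb32(src);
        const int is_cdat = !memcmp(src + 4, "cdat", 4);
        const int is_cdt2 = !memcmp(src + 4, "cdt2", 4);

        /* Validate before subtracting, so nothing can wrap around:
           header plus one pair, even length, within the input. */
        if (atom_size < 10 || atom_size > src_size || (atom_size & 1))
            return -1;

        if (is_cdat || is_cdt2) {
            const uint8_t *data = src + 8;
            const size_t data_size = atom_size - 8;

            /* three output bytes per input pair must fit in dst */
            if (data_size / 2 > (dst_size - out) / 3)
                return -1;

            for (size_t i = 0; i < data_size; i += 2) {
                /* marker bits, cc_valid, and field (0: cdat, 1: cdt2) */
                dst[out++] = (0x1F << 3) | (1 << 2) | (is_cdt2 ? 1 : 0);
                dst[out++] = data[i];
                dst[out++] = data[i + 1];
            }
        }

        src      += atom_size;
        src_size -= atom_size;
    }

    *out_size = out;
    return 0;
}
```

With this layout each cdat pair is prefixed with 0xFC and each cdt2 pair
with 0xFD, the same values the (0x1F << 3) | (1 << 2) | (cc_field - 1)
expression in the patch produces, and an atom_size below 10 is rejected
before any subtraction takes place.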
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".