Steve Lhomme: > On 3/27/2019 12:18 PM, Andreas Rheinhardt via ffmpeg-devel wrote: >> By default, the data_offset member of the AVFormatInternal of the >> AVFormatContext associated with the MatroskaDemuxContext has not been >> initialized explicitly by any Matroska-specific function, so that it >> was >> initialized by default to the offset at the end of >> matroska_read_header, >> i.e. usually to the offset of the length field of the first encountered >> cluster. This meant that in case that the Matroska-specific seek-code >> fails because there are no index entries for the target track a seek to >> data_offset would be performed and ordinary parsing would start from >> there which is nonsense: The length field would be treated as EBML >> ID and >> (if the length field is not longer than four bytes (EBML numbers that >> long are rejected as invalid EBML IDs)) and whatever comes next >> would be >> treated as its EBML size although it simply isn't. >> >> Signed-off-by: Andreas Rheinhardt <andreas.rheinha...@googlemail.com> >> --- >> libavformat/matroskadec.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c >> index 49f8ff4082..f9811b54a1 100644 >> --- a/libavformat/matroskadec.c >> +++ b/libavformat/matroskadec.c >> @@ -2651,6 +2651,9 @@ static int >> matroska_read_header(AVFormatContext *s) >> pos = avio_tell(matroska->ctx->pb); >> res = ebml_parse(matroska, matroska_segment, matroska); >> } >> + /* Set data_offset as it might be needed later by >> seek_frame_generic. */ >> + if (matroska->current_id) > > I'm surprised this doesn't error out if a (level 1) ID is not found here. >There are two ways how ebml_parse can return 1 in the earlier call: It can find (and consume) a cluster ID; or it can hit EOF if it is in the mode for parsing live header files (these don't contain clusters at all). In the latter case, matroska->current_id is unset.
And if we are in the normal (non-live-header) mode and no cluster has been found, then we will end up at fail anyway (when matroska_resync eventually hits the end of input). Thinking about this, there is something wrong with this approach: A Matroska file needn't contain a cluster at all (e.g. a chapter- or tags- or attachment-only file is not against the spec if I am not mistaken). But this is orthogonal to the patch at hand. >> + s->internal->data_offset = avio_tell(matroska->ctx->pb) - 4; > > The "- 4" is OK as long as level 1 elements are always 4 bytes (which > is the case). But if matroska_resync() ever exits if it finds an EBML > Void or CRC-32 then this will break. > > The code is safe for now but may not be future proof. > As has already been said: If matroska->current_id is set at this point, then it contains a cluster ID, not an arbitrary ID, because the IDs that matroska_resync finds are fed to ebml_parse before this check and a level 1 ID different than a cluster is parsed. Nevertheless, the check should probably be changed to explicitly check for a cluster ID, just to make this more readable. >> matroska_execute_seekhead(matroska); >> if (!matroska->time_scale) >> -- >> 2.19.2 _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".