On Thu, May 10, 2018 at 4:08 PM, Jan Ekström <jee...@gmail.com> wrote: > On Fri, May 11, 2018 at 1:39 AM, Alex Converse <alex.conve...@gmail.com> > wrote: >> On Thu, May 3, 2018 at 10:58 AM, Jan Ekström <jee...@gmail.com> wrote: >>> >>> Hi, >>> >>> While the timestamp field is 8+24=32 bits, the DTS value is int64_ts. >>> Thus, while I'm really bad at doing wrap-around logic, wouldn't it be >>> something a la ´uint32_t wrapped_timestamp = (uint32_t)(ts % >>> UINT32_MAX)`? >>> >> >> See the note about timestamps below. >> >>> Additionally, I did briefly go through E.4.1 (FLV Tag) in the spec and >>> it doesn't really seem to define whether this value is PTS or DTS... >>> Is this defined anywhere proper? Given FLV's age I wouldn't be >>> surprised that the answer would be "effectively DTS", of course. But >>> if you've seen what Adobe's implementation writes with B-frames it'd >>> be interesting to see. >>> >> >> FLV AVC packets have a field called CompositionTime described to match >> 14496-12. FLV requires this composition offset to be zero for sequence header >> type packets. This strongly indicates to me that we want DTS here. In >> addition >> the current muxer and demuxer already use dts pretty consistently for this >> value (and use cts = pts - dts). >> > > Alright, so dts it is... That only leaves out the fact that we're > stuffing int64_t into an "unsigned int", which most likely will not > warn on 64bit linux, but most likely will on various architectures > and/or systems. > > So we: > 1) take in the dts as int64_t. > 2) apply wrap-over at either 32 or 31 bits depending on if the signed > value is ever supposed to be negative. Depends on how wrap-around in > FLV is supposed to be handled (aka "Is the negative area ever supposed > to be used from the 32bit signed integer field?"). > 3) Write it there. > >> >>> > int64_t data_size; >>> > AVIOContext *pb = s->pb; >>> > FLVContext *flv = s->priv_data; >>> > @@ -492,8 +492,8 @@ static void flv_write_codec_header(AVFormatContext* >>> > s, AVCodecParameters* par) { >>> > par->codec_type == AVMEDIA_TYPE_VIDEO ? >>> > FLV_TAG_TYPE_VIDEO : FLV_TAG_TYPE_AUDIO); >>> > avio_wb24(pb, 0); // size patched later >>> > - avio_wb24(pb, 0); // ts >>> > - avio_w8(pb, 0); // ts ext >>> > + avio_wb24(pb, ts & 0xFFFFFF); >>> > + avio_w8(pb, (ts >> 24) & 0x7F); >>> >>> Looking at the spec this is bottom 24 bits of the timestamp and top 8, >>> thus why is the second one not & 0xFF ? >>> >> >> The mapping from dts to the flv timestamp we write to the stream is identical >> for to what we do for non header packets. [1] I can pull this into a separate >> function but it's only two lines so it didn't seem worth while. > > Yes, I think I commented on this that I hadn't noticed the signedness > of the value there :) So this is 100% a-OK. > >> >>> > avio_wb24(pb, 0); // streamid >>> > pos = avio_tell(pb); >>> > if (par->codec_id == AV_CODEC_ID_AAC) { >>> > @@ -756,7 +756,7 @@ static int flv_write_header(AVFormatContext *s) >>> > } >>> > >>> > for (i = 0; i < s->nb_streams; i++) { >>> > - flv_write_codec_header(s, s->streams[i]->codecpar); >>> > + flv_write_codec_header(s, s->streams[i]->codecpar, 0); >>> > } >>> > >>> >>> Is it customary that even if you've got a live stream going, that the >>> start point of a given ingest is always zero? In that case you might >>> want to take mention of the initial dts, and then write offsets >>> compared to that? Otherwise if you have an initial offset of, say, 5s, >>> and then you a 5s GOP with an update, then while the difference to the >>> initial timestamp would be 5s, using the original dts field you would >>> get the second header having a value of 10s there. Or am I >>> misunderstanding something here? >> >> The FLV file (on disk) format requires the initial DTS to be zero. For FLV >> over RTMP this is not the case. >> > > Oh, interesting... makes sense, of course, for continuing a stream > relative to the last timestamp of the previous ingest. > >> At one point the flv muxer handled that itself, but that changed to handle >> copyts better.[2] >> >> For non-zero starting DTS should header packets have non-zero >> timestamp? Maybe but that's the case irrespective of weather or >> not we offset late extradata. Meanwhile many popular demuxers >> (avformat included) special case initial extradata. > > Interesting, so in theory we should be making the initial > initialization data to be according to the DTS according to your > reading? Or is the initial extradata packet always supposed to be > zero, irrelevant of the actual initial content DTS? > > As soon as these questions can be answered I think we've got the ball > rolling on getting this in. As I mentioned already, this is an > improvement. >
There are some subtle differences in how FLV timestamps are handled on the wire in rtmp vs in files, and I think that's what's creating most of he complexity here. In RTMP timestamps can start at any value, are unsigned, and have 32-bit roll over semantics. In FLV files timestamps "always" begin at 0 and are 32-bit "signed" but the spec doesn't give any guidance on how negative values should be interpreted or what to do in the case that the (logical) timestamp of a packet is greater than (signed) INT32_MAX. Both of these specs are a little underspecified in my opinion and the behavior of popular implementations should probably be considered. On Thu, May 10, 2018 at 4:15 PM, Jan Ekström <jee...@gmail.com> wrote: > On Fri, May 11, 2018 at 1:48 AM, Alex Converse <alex.conve...@gmail.com> > wrote: >> >> Sorry about the lateness on my part. Were there any later comments. I >> haven't seen any. (And yes I can push this myself when all the >> concerns are resolved). > > It's OK, it has been a rather busy time for me as well. > > I was already thinking if you had given up on this and if I should > just apply the changes I thought were correct and re-post it for > review myself: I think changes to broader FLV timestamp semantics ought to be in separate patches. > * making the function take in int64_t. > * DTS is most likely what's wanted so no change there. > * most likely FLV never used negative timestamps (even though the > value is signed) so: > 1.wrap-around at the 31 bits for the signed integer value, and then I have some doubts that this behavior is correct. In particular it breaks the copyts behavior that the muxer tries to respect. 31-bit wrap around is not a concept that any of the specs mention. How do players handle it. > 2. keep the 0x7F because the topmost bit of a signed field would > never be used. > * try to figure out if the initial metadata packet should specifically > be starting with a zero timestamp, or the initial DTS of the input? This maybe different for file-flv and rtmp-flv. Best Regards, Alex _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel