James Almer: > On 3/14/2021 7:34 AM, Anton Khirnov wrote: >> Quoting James Almer (2021-03-12 18:24:47) >>> >>> The padding exists AFAIK because something, like an optimized bitstream >>> reader that tries to process several bytes at the same time, may end up >>> reading or writing pass the reported end of the buffer. >>> That means that if reading and it's not all zeroes, it could in theory >>> have unpredictable results. Hence why everything always zeroes the >>> padding, even av_shrink_packet(). >> >> On that topic, I've been wondering about eliminating this requirement. >> Does anyone know which code it is precisely that depends on the padding >> being zeroed? Is this optimization really worth it? >> It seems rather silly to jump through all these hoops for an >> unmeasurable benefit in one decoder. > > Some subtitle demuxers didn't look at pkt->size and depended on the > padding to work as a 0 string terminator, but Marton fixed that in a > patchset sent yesterday.
You mean subtitle decoders; some subtitle demuxers have a different bug: They only zero-terminate their extradata with a single '\0', not with AV_INPUT_BUFFER_PADDING_SIZE. See ff_bprint_to_codecpar_extradata(). And ff_startcode_find_candidate_c() also requires that data is zero-terminated immediately after the buffer. It is trivial to fix. (My old patchset would actually avoid overreading altogether.) > > Beyond that, i think the v210 decoder and encoder read and write past > the end of the buffer if you use the simd functions. > >> >> It would also give us zero-copy packet splitting. >> >>> >>> And yes, what you describe is what some bitstream filters and the >>> matroska demuxer do. They just create several packet references pointing >>> to the same data buffer, but using different offsets for the data >>> pointer. They all have "padding", but in many cases said padding is the >>> beginning of the payload of another packet, and that's not ideal. >>> >>>> >>>>> >>>>> I'm sure there are other similar valid scenarios where attempting to >>>>> shrink a non writable packet is not "bad API usage". >>>>> >>>>>> >>>>>> If you want to shrink a writable packet, then maybe you don't even >>>>>> need zero padding, because the buffer possibly already contains some >>>>>> defined value, and the main reason of zero padding is avoiding >>>>>> reading uninitialized data... >>>>> >>>>> If i allocate a payload of size 1024, ultimately fill only 512 bytes, >>>>> then resize it to that value (typical scenario in lavf demuxers), if i >>>>> don't zero the bytes after the 512 offset then they will remain >>>>> uninitialized. >>>> >>>> I am not sure I see your point here, if the data after the padding is >>>> uninitalized, that is not a problem. I made a typo above, and meant >>>> non-writable packet in my comment. And my reasoning is that if a packet >>>> is non-writable, it already contains initialized data with a zero >>>> padding. If you shrink that by reducing pkt->size only, you will not >>>> have uninitialized data, only the padding will not be zeroed. And that >>>> is preferable to copying the data only for zeroing the padding, because >>>> - as far as I know - the padding does not have to be zeroed, it only >>>> have to be initialized. >>>> >>>> I agree that it is not nice that av_shrink_packet() writes something to >>>> the packet, people may not think about it and misuse it instead of >>>> directly decreaseing pkt->size when they need a partial packet. That is >>>> why I suggested the assert(). One might argue that it kind of breaks >>>> behaviour, but I'd say it does not break it too much, and writing to a >>>> non-writable packet was broken in the first place. >>>> >>>> Alternatively we can change av_shrink_packet() to only zero the padding >>>> if the packet is writable, which has the benefit that it will do what >>>> people generally expect, no matter if you throw a writable or a >>>> non-writable packet to it. >>>> >>>> A third alternative is to introduce void av_shrink_packet2() in which >>>> you explicitly document that a writable packet is needed and do the >>>> assert there, and deprecate av_shrink_packet(). >>> >>> Not a fan of functions with a 2 suffix. And to be honest, I really don't >>> care about what we do with av_shrink_packet(). >>> Do you want to keep it around? Ok. Want to deprecate and remove it? >>> Better. Want to add an assert to it? Not a fan and it may result in >>> unexpected crashes for library users, but whatever. >> >> I would suggest keeping av_shrink_packet() with a big scary warning that >> it does not check for ownership and it is the caller's responsibility to >> ensure that writing to the packet is safe. > > If we can remove the zero-padding requirement, or the padding > requirement altogether, then that would no longer be necessary. > >> >> Also, I'd like to emphasise that av_*_is_writable() is not gospel. It is >> merely a convention that is used "by default", when you don't have more >> accurate information. >> Some bits of code may use other conventions and consider a buffer >> writable even if av_buffer_is_writable() returns 0. For example this is >> at the core of frame threading, where a reference to a frame currently >> being decoded is propagated to other threads before decoding finishes. >> The owner thread then writes to the frame because frame-mt conventions >> allow it to, even though there are multiple references to the frame. >> > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".