Hi,

On Wed, May 6, 2015 at 2:00 PM, Carl Eugen Hoyos <ceho...@ag.or.at> wrote:
> Ronald S. Bultje <rsbultje <at> gmail.com> writes:
>
> > Overall, the effect would be minor, like in the lower
> > single-digit percents or perhaps even fractional percent,
> > but I would absolutely expect a small performance gain
> > from using p10/p12 over p16 w/ bits_per_coded_sample.
> > Also note most of this would only be noticeable after
> > simd optimizations; in C there would be no difference.
>
> Thank you for the explanation!
>
> I still wonder if it was a good idea to add the
> formats >8 and <16...

I think the biggest issue with going with 16 and using
bits_per_coded_sample is enforcing that the lowest bits are actually
zero. In practice, what I foresee is that every DSP operation would
spend two cycles per set of pixels to downshift and upshift, or a few
(I don't know exactly how many) cycles to mask the lowest bits to zero
(like val &= 0xffc0 for 10 bpp) every time the codec ABI requires it.

For older codecs where exact reconstruction isn't defined (like
MPEG-1/2), this wouldn't matter, but for h264, hevc, vp9 and similar
codecs, this would be a headache, not so much in terms of performance
as in terms of actually getting the decoder to work (or the encoder to
produce optimal results, which might be even harder).

Having said that, I agree that having hundreds of AV_PIX_FMT_ defines
isn't ideal either. I wish there were a different way, but I can't
really think of one off the top of my head.

Ronald
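To make the per-operation cost concrete, here is a minimal C sketch
(hypothetical code, not actual FFmpeg DSP routines; the function names
and the rounding-average kernel are invented for illustration) of the
same operation in both layouts:

#include <stdint.h>
#include <stddef.h>

/* Native p10: samples are 0..1023 in 16-bit words, nothing extra
 * to do. */
static void avg_row_p10(uint16_t *dst, const uint16_t *a,
                        const uint16_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (a[i] + b[i] + 1) >> 1;
}

/* p16 w/ bits_per_coded_sample == 10: samples are MSB-aligned
 * (value << 6) and the ABI requires the low 6 bits to stay zero.
 * The rounding average can leak a bit into the padding, so every
 * store needs an extra mask (or a downshift before and an upshift
 * after). */
static void avg_row_p16_msb(uint16_t *dst, const uint16_t *a,
                            const uint16_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = ((a[i] + b[i] + (1 << 6)) >> 1) & 0xffc0;
}

In SIMD that extra mask is one more pand (or a psrlw/psllw pair) on
every store, which is where the small percentage difference mentioned
above would come from.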