On Sun, 30 Apr 2017 23:55:02 +0700 Muhammad Faiz <mfc...@gmail.com> wrote:
> On Thu, Apr 27, 2017 at 3:51 PM, Nicolas George <geo...@nsup.org> wrote: > > L'octidi 8 floréal, an CCXXV, Michael Niedermayer a écrit : > >> I agree > >> in fact i added such a flag in 2011 > >> (4d34b6c1a1254850e39a36f08f4d2730092a54db) > >> within the API of that time to avfilter > > > > It was not a bad idea, but it should not be limited to filters. A few > > comments. > > > > * First, the framequeue framework does not produce unaligned code. > > According to the C standard, the data it handles stay aligned. The > > alignment problems come from non-standard requirements by special > > processor features used by some filters and codecs, but not all. > > > > * That means a lot of the most useful codecs and filters will suffer > > from it, but not all. For many tasks, the alignment is just fine, and > > the extra copy would be wasteful. > > > > * The alignment requirements increase. Before AVX, it was up to 16, now > > it can be 32, and I have no doubt future processor will at some point > > require 64 or 128. But realigning buffers used with SSE to 32 would be > > wasteful too. Thus, we do not require a flag but a full integer. > > > > * The code that does the actual work of realigning a buffer should > > available as a stand-alone API, to be used by applications that work at > > low-level. I suppose something like that would be in order: > > > > int av_frame_realign(AVFrame *frame, unsigned align); > > > > Or maybe: > > > > int av_frame_realign(AVFrame *frame, unsigned align, > > AVBufferAllocator *alloc); > > > > where AVBufferAllocator is there to allocate possibly hardware or mmaped > > buffers. > > > > * It is another argument for my leitmotiv that filters and codecs are > > actually the same and should be merged API-wise. > > > > * It would be better to have the API just work for everything rather > > than documenting the alignment needs. > > > > As for the actual implementation, I see a lot of different approaches: > > > > - have the framework realing the frame before submitting it to the > > filters and codecs: costly in malloc() and memcpy() but simple; > > > > - have each filter or codec call av_frame_realign() as needed; it may > > seem less elegant than the previous proposal, but it may prove a > > better choice in the light of what follows; > > > > - have each filter or codec copy the unaligned data into a buffer > > allocated once and for all or on the stack, possibly by small chunks: > > less costly in malloc() and refcounting overhead, and possibly better > > cache-locality, but more complex code; > > > > - run the plain C version of the code on unaligned data rather than the > > vectorized version, or the less-vectorized version (SSE vs AVX) on > > insufficiently aligned data. > > > > Since all this boils down to a matter of performance and is related to > > the core task of FFmpeg, I think the choice between the various options > > should be done on a case-by-case basis using real benchmarks. > > Are you working on these? Because currently I'm not. I think it's probably ok to push your current patch, since the original author of the patch who broke this probably won't fix it. _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel