On Mon, Nov 24, 2014 at 12:35:58PM +0100, Daniel Oberhoff wrote:
> input -> filter1 -> filter2 -> output
>
> some threads processing frame n in the output (i.e. encoding), other
> threads processing frame n+1 in filter2, others processing frame n+2 in
> filter1, and yet others processing frame n+3 decoding. This way
> non-parallel filters can be sped up, and diminishing returns from too
> much striping can be avoided. With modern CPUs scaling easily up to 24
> hardware threads I see this as necessary to fully utilize the hardware.
Keep in mind two things:

1) It only works for cases where many filters are used, which is not
necessarily a common case.

2) While it would possibly be simpler to implement, you do not want each
filter to use its own thread. That leads to massive bouncing of data
between caches and, especially for filters that do in-place processing, a
large amount of cache-coherency traffic. Ideally, when used with frame
multithreading, you would even re-use the thread that did the decoding.

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel