On Mon, Nov 24, 2014 at 12:35:58PM +0100, Daniel Oberhoff wrote:
> input -> filter1 -> filter2 -> output
> 
> some threads processing frame n in the output (i.e. encoding), other threads 
> processing frame n+1 in filter2, others processing frame n+2 in filter1, and 
> yet others decoding frame n+3. This way non-parallel filters can be sped up, 
> and diminishing returns from too much striping can be avoided. With modern 
> CPUs scaling easily up to 24 hardware threads, I see this as necessary to 
> fully utilize the hardware.

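For concreteness, the per-filter pipelining described above could look roughly
like the sketch below. This is plain pthreads, not the FFmpeg APIs; the queue
size, stage names and the use of a bare int as a stand-in for a frame are all
made up for illustration. Each stage owns one thread and hands frames
downstream through a small blocking queue, so frame n can be encoded while
frame n+1 sits in filter2, n+2 in filter1 and n+3 is being "decoded".

/* Sketch only: per-filter pipeline, one thread per stage, queues in between. */
#include <pthread.h>
#include <stdio.h>

#define QUEUE_SIZE 4
#define NUM_FRAMES 16

typedef struct {
    int frames[QUEUE_SIZE];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
} Queue;

static void queue_init(Queue *q) {
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

static void queue_push(Queue *q, int frame) {
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_SIZE)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->frames[q->tail] = frame;
    q->tail = (q->tail + 1) % QUEUE_SIZE;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

static int queue_pop(Queue *q) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    int frame = q->frames[q->head];
    q->head = (q->head + 1) % QUEUE_SIZE;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return frame;
}

typedef struct {
    const char *name;
    Queue *in, *out;   /* out == NULL for the last stage */
} Stage;

/* Each stage: pull a frame from upstream, "process" it, push it downstream.
 * -1 is used as an end-of-stream marker. */
static void *stage_thread(void *arg) {
    Stage *s = arg;
    for (;;) {
        int frame = queue_pop(s->in);
        if (frame < 0) { if (s->out) queue_push(s->out, -1); break; }
        printf("%s: frame %d\n", s->name, frame);  /* real work goes here */
        if (s->out) queue_push(s->out, frame);
    }
    return NULL;
}

int main(void) {
    Queue q1, q2, q3;
    queue_init(&q1); queue_init(&q2); queue_init(&q3);

    Stage stages[] = {
        { "filter1", &q1, &q2 },
        { "filter2", &q2, &q3 },
        { "encode",  &q3, NULL },
    };
    pthread_t threads[3];
    for (int i = 0; i < 3; i++)
        pthread_create(&threads[i], NULL, stage_thread, &stages[i]);

    /* the "decoder" runs on the main thread and feeds the first queue */
    for (int n = 0; n < NUM_FRAMES; n++)
        queue_push(&q1, n);
    queue_push(&q1, -1);

    for (int i = 0; i < 3; i++)
        pthread_join(threads[i], NULL);
    return 0;
}
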
Keep in mind two things:
1) It only works for cases where many filters are used, which is not
necessarily the common case.
2) While it might be simpler to implement, you do not want each filter to use
its own thread. That leads to massive bouncing of data between caches and,
especially for filters that use in-place processing, a large amount of
cache-coherency traffic.
Ideally, when used with frame multithreading, you would even re-use the
thread that did the decoding.