On Thu, May 28, 2015 at 7:39 PM, Stefano Sabatini <stefa...@gmail.com> wrote:
> On date Monday 2015-05-18 13:26:56 +0200, Stefano Sabatini encoded:
>> On Mon, May 18, 2015 at 1:17 PM, Hendrik Leppkes <h.lepp...@gmail.com>
>> wrote:
>>
>> > On Mon, May 18, 2015 at 12:37 PM, Stefano Sabatini <stefa...@gmail.com>
>> > wrote:
>> >
>> [...]
>>
>> > > I have a first hackish patch, performed some tests and I got some
>> > > significant performance gains, on my iCore5 with Intel Graphics HD4000 I
>> > > have now the same performance as the software decoder using DXVA2 for
>> > > decoding a H.264 1920x1080 video, but using only a single thread. The
>> > > patch as is is a hack, since I had to modify the compilation flags to
>> > > enable assembly compilation in the ffmpeg_dxva2.c file. I should probably
>> > > create an optimized copy function in libavutil, comments are welcome.
>> >
>> > FWIW, I never saw any benefits from using a small cache over simply
>> > copying directly to the destination memory, that could potentially
>> > simplify this a bit.
>> >
>> > And yeah, its a huge hack, we don't want new inline assembly.
>> >
>>
>> The sanest approach is probably to add a function to libavutil. The
>> optimized copy would then be accessible to third-party library users, with
>> no assembly hacks involved.
>
> New patch attached, it's still somehow hackish, please advice if you
> consider this approach acceptable.
>

The general concept is fine, but it should not use inline asm, and
someone will want to argue about the name and placement etc... :)
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel