On Sun, Aug 16, 2015 at 07:13:07PM -0300, Pedro Arthur wrote:
> 2015-08-15 7:24 GMT-03:00 Michael Niedermayer <mich...@niedermayer.cc>:
>
> > these are not git patches
> >
> Yes, they are raw git diffs.
>
> > > A - New code
> >
> > doesnt compile (but that doesnt matter as you say this is slower anyway)
> > libswscale/swscale.c: In function ‘swscale’:
> > libswscale/swscale.c:529:18: error: ‘i’ undeclared (first use in this
> > function)
> >
> Fixed.
>
> > time ./ffmpeg -i matrixbench_mpeg2.mpg -an -vf
> > scale=1920:1080,scale=720:480 -f null -
> >
> Performance seems good for C. But this is not a good test for measuring
> the difference
> between the split vs merged color conversion and horizontal scaling as the
> source slice passed
> to be scaled is already in YUV format and thus there is no need for color
> conversion.
> Indeed the split color conversion should perform better as the code path is
> shorter, instead
> of calling the "process" function twice, one for color conversion and one
> for h scaling, it will
> call only the hscaling function.
>
> > also this seems well working except
> > make -j4 libswscale/swscale-test
> > gdb --args libswscale/swscale-test
> >
> It seems the api is being used incorrectly in swscale-test.c.
>
> The following code creates a sws context with srcH = H / 12, srcW = W / 12.
> Next it calls the scaling functions with srcY = 0, srcH = H. Thus it is
> scaling more lines
> than were specified when creating the context.
> Is it intended or it is a bug? If it is a bug I can put a check in the
> sws_scale function, if not
bug and yes, a check to avoid crashing is a good idea

> I'll have to think a solution for this, as the new code expects only H/12
> lines to be scaled.
>
> sws = sws_getContext(W / 12, H / 12, AV_PIX_FMT_RGB32, W, H,
> AV_PIX_FMT_YUVA420P, SWS_BILINEAR, NULL, NULL, NULL);
> [...]
> sws_scale(sws, rgb_src, rgb_stride, 0, H, src, stride);
>
>
> I'm attaching the diff for the fixed new code with split color
> conversion/hscaling
> (referenced as A previously) and a new one, that I'll call D, which is A
> with line batches.
> Thus you can test both approaches, split/merged color conversion
> with/without line
> batches.
> As soon as we decide which approach is better I can send a definitive patch.

the batches seem always faster (D and swscale_merge_batch.patch)
the difference between the split and non-split variants seems a lot smaller
in my tests, but having the steps split seems an advantage to me, so
possibly the split variant might be best

also feel free to split the batch addition into a separate commit
(should be easy as you already have a version with and without)

also, please send me your public ssh key, i think you should have
direct write access to ffmpeg git

A2
real 0m20.942s
real 0m20.995s
real 0m20.989s

D
real 0m20.650s
real 0m20.647s
real 0m20.658s

swscale_merge_batch.patch
real 0m20.730s
real 0m20.663s
real 0m20.601s


time ./ffmpeg -i matrixbench_mpeg2.mpg -an -vf format=bgr32,scale=1920:1080,scale=720:480,format=bgr32 -t 30 -f null -

swscale_merge_batch.patch
real 0m34.334s
real 0m34.346s
real 0m34.339s

D
real 0m34.509s
real 0m34.470s
real 0m34.637s

ref:
real 0m34.579s
real 0m34.441s
real 0m34.486s


time ./ffmpeg -i matrixbench_mpeg2.mpg -an -vf format=yuyv422,scale=1920:1080,scale=720:480,format=yuyv422 -t 90 -f null -

real 0m18.415s

D
real 0m18.473s
real 0m18.512s
real 0m18.484s

swscale_merge_batch.patch
real 0m18.468s
real 0m18.500s
real 0m18.516s

A2
real 0m18.653s
real 0m18.582s

[...]
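
for the swscale-test crash discussed above, the check could look roughly like the sketch below. this is only an illustration: slice_in_bounds() is a hypothetical standalone helper, not an existing libswscale function, and where exactly it would sit inside sws_scale and what error it would return are left open. it takes the source height the context was created with plus the slice start and slice height passed by the caller.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of libswscale): returns 1 when the slice
 * [slice_y, slice_y + slice_h) fits inside a source of ctx_src_h lines,
 * and 0 otherwise.
 */
static int slice_in_bounds(int ctx_src_h, int slice_y, int slice_h)
{
    /* A context with a negative source height would be a caller bug. */
    assert(ctx_src_h >= 0);

    /* Reject negative offsets and empty or negative slices. */
    if (slice_y < 0 || slice_h <= 0)
        return 0;

    /* Compare via subtraction so slice_y + slice_h cannot overflow. */
    return slice_y <= ctx_src_h && slice_h <= ctx_src_h - slice_y;
}
```

with a context created for H / 12 source lines, the call from swscale-test (slice start 0, slice height H) fails this test for any H >= 12, so the scaler could return an error instead of reading past the lines it was set up for.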
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The bravest are surely those who have the clearest vision of what is before
them, glory and danger alike, and yet notwithstanding go out to meet it.
-- Thucydides
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel