2017-10-25 22:08 GMT+02:00 Paul B Mahol <one...@gmail.com>:

> On 10/25/17, Martin Vignali <martin.vign...@gmail.com> wrote:
> > 2017-10-25 21:53 GMT+02:00 Paul B Mahol <one...@gmail.com>:
> >
> >> On 10/25/17, Martin Vignali <martin.vign...@gmail.com> wrote:
> >> > 2017-10-25 9:43 GMT+02:00 Paul B Mahol <one...@gmail.com>:
> >> >
> >> >> On 10/21/17, Martin Vignali <martin.vign...@gmail.com> wrote:
> >> >> > Hello,
> >> >> >
> >> >> > In attach patch to add AVX2 version for add_bytes
> >> >> >
> >> >> > 0001-libavcodec-lossless_videodsp-add-add_bytes-avx2-vers :
> >> >> > add AVX2 version
> >> >> >
> >> >> > pass fate-test for me (os 10.12, x86_64)
> >> >> >
> >> >> > checkasm result : (Kaby Lake) (run 10 times, and i took the fastest
> >> >> > version)
> >> >> > checkasm: all 2 tests passed
> >> >> > add_bytes_c: 108.7
> >> >> > add_bytes_sse2: 26.5
> >> >> > add_bytes_avx2: 15.5
> >> >> >
> >> >> >
> >> >> > 0002-libavcodec-lossless_video_dsp-cosmetic-add-better-se:
> >> >> > only cosmetic
> >> >> > like the ref c function declaration in asm file is not consistent
> >> >> > between
> >> >> > each asm file
> >> >> > i think a better separator for each function make the file easier to
> >> >> > read
> >> >> >
> >> >> > also add the c declaration for add bytes in comment
> >> >> >
> >> >> >
> >> >> > Martin
> >> >> >
> >> >>
> >> >> Are you sure 32bit alignment is actually enforced?
> >> >>
> >> >>
> >> > Hello,
> >> >
> >> > I think, data used by add_bytes is always aligned
> >> > because dst and src, are start of a line of an AvFrame
> >>
> >> Yes, but try width thats not multiple of 32.
> >> _______________________________________________
> >>
> >>
> > Sorry, not sure i understand.
> > following the doc, AVFrame->linesize, is multiple of max alignment
> >
> > and in the asm, loop will be repeat until, val < width
> >
> > Can you indicate me, the part, where you think, it's not ok ?
>
> I dunno. You should test it with widths not divisible by 32.
>

Tested with the FATE sample vsynth3-huffyuvbgra.avi (34x34):

./ffmpeg -i ./tests/data/fate/vsynth3-huffyuvbgra.avi -f framecrc -

generates the same CRC as

./ffmpeg -i ./tests/data/fate/vsynth3-huffyuvbgra.avi -f framecrc - -cpuflags 0

>
> also try encoding cropped video.
>

Are you sure that encoding cropped video has any link to the decoding DSP
function? These patches only concern the decoding function.

The encoding functions of huffyuvenc (in the huffyuv add add/diff_bytes16 AVX2
discussion) and losslessencdsp (not done for now) have a test for the
alignment of dst and src.

Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
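
For reference, the width question raised above can be illustrated with a small C sketch. It is not the assembly from the patch. It only assumes that add_bytes has the semantics dst[i] += src[i] over w bytes, and it compares a plain scalar loop against a loop that works in 32-byte blocks (the width of an AVX2 register) followed by a scalar tail. The names add_bytes_ref, add_bytes_blocked and check_width are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference: dst[i] += src[i] for each byte, wrapping modulo 256. */
static void add_bytes_ref(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    for (ptrdiff_t i = 0; i < w; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Hypothetical stand-in for a 32-byte-wide SIMD loop (plain C, no
 * intrinsics): full 32-byte blocks first, then a scalar tail for the
 * remaining w % 32 bytes. */
static void add_bytes_blocked(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    ptrdiff_t i = 0;

    for (; i + 32 <= w; i += 32)
        for (int j = 0; j < 32; j++)
            dst[i + j] = (uint8_t)(dst[i + j] + src[i + j]);

    for (; i < w; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Returns 1 if both versions leave identical buffers for width w.
 * The whole buffer is compared, so a write past w is detected too. */
int check_width(ptrdiff_t w)
{
    enum { MAX_W = 256 };
    uint8_t src[MAX_W], a[MAX_W], b[MAX_W];

    assert(w >= 0 && w <= MAX_W);

    /* Deterministic, arbitrary test pattern. */
    for (int i = 0; i < MAX_W; i++) {
        src[i] = (uint8_t)(i * 7 + 3);
        a[i] = b[i] = (uint8_t)(i * 13 + 5);
    }

    add_bytes_ref(a, src, w);
    add_bytes_blocked(b, src, w);

    return memcmp(a, b, MAX_W) == 0;
}
```

Calling check_width(34) exercises the same case as the 34x34 sample: one full 32-byte block plus a 2-byte tail. A SIMD version that instead relies on the padding in AVFrame->linesize may write past w, which a whole-buffer comparison like this would report as a mismatch even when the decoded picture is unaffected.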