2019-01-06 23:55 GMT+01:00, Peter Ross <pr...@xvid.org>:
> On Sun, Jan 06, 2019 at 12:57:37PM +0100, Carl Eugen Hoyos wrote:
>> 2019-01-06 12:12 GMT+01:00, Peter Ross <pr...@xvid.org>:
>> > for the '127-bit shift left' algorithm to work as intended,
>> > little-endian reads and writes must be used.
>> >
>> Why not using AV_WL64() and AV_RL64()?
>
> good question.
>
>> Is there a measurable speed difference?
>
> x86: no difference, compiler output is identical.
>
> other cpus that do not support unaligned reads: big difference, due to the
> compiler inserting additional instructions to check the alignment of the
> data.
>
> mipsel:
> RN64A: bench: utime=105.902s stime=0.040s rtime=105.933s
> RN64:  bench: utime=230.055s stime=0.004s rtime=230.082s
>
> why so much difference? the 127-bit shift left operation must happen for
> each 1-bit DSD sample. in a single channel of DSD audio, there are at
> least 2.8 million 1-bit samples per second.

Sounds to me as if we would need AV_R[BL]xxA, since aligned access
is available for native endian (only).
But this may be unrelated to your patch; you could add a comment,
though, that this makes a difference.

Thank you, Carl Eugen
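
To illustrate what such an aligned little-endian accessor could look like, here is a minimal sketch. It assumes the state is kept as two aligned 64-bit words in little-endian byte order. The names bswap64(), host_is_big_endian(), rl64a(), wl64a() and shift_left_1() are hypothetical and are not part of libavutil; this is not the code from the patch.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of a 64-bit value. */
static inline uint64_t bswap64(uint64_t v)
{
    v = ((v & 0x00ff00ff00ff00ffULL) << 8)  | ((v >> 8)  & 0x00ff00ff00ff00ffULL);
    v = ((v & 0x0000ffff0000ffffULL) << 16) | ((v >> 16) & 0x0000ffff0000ffffULL);
    return (v << 32) | (v >> 32);
}

/* Hypothetical helper: detect host byte order at run time. */
static inline int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 0;
}

/* Aligned little-endian read: one native aligned load, plus a byte
 * swap on big-endian hosts only. */
static inline uint64_t rl64a(const uint64_t *p)
{
    uint64_t v = *p;
    return host_is_big_endian() ? bswap64(v) : v;
}

/* Aligned little-endian write, the counterpart of rl64a(). */
static inline void wl64a(uint64_t *p, uint64_t v)
{
    *p = host_is_big_endian() ? bswap64(v) : v;
}

/* Shift a 128-bit state (two little-endian 64-bit words, state[0]
 * holding the low half) left by one bit. The top bit of the low word
 * carries into the high word, and `bit` becomes the new lowest bit. */
static void shift_left_1(uint64_t state[2], unsigned bit)
{
    uint64_t lo = rl64a(&state[0]);
    uint64_t hi = rl64a(&state[1]);

    wl64a(&state[1], (hi << 1) | (lo >> 63));
    wl64a(&state[0], (lo << 1) | (bit & 1));
}
```

Because the pointers are typed as uint64_t, the compiler may assume natural alignment and emit a single load or store per word, which is the case the RN64A numbers above measure. On a little-endian host the byte swap is never taken.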