On Mon, 21 Mar 2022, Ben Avison wrote:
On 18/03/2022 19:10, Andreas Rheinhardt wrote:
Ben Avison:
+static int vc1_unescape_buffer_neon(const uint8_t *src, int size, uint8_t
*dst)
+{
+ /* Dealing with starting and stopping, and removing escape bytes, are
+ * comparatively less time-sensitive, so are more clearly expressed
using
+ * a C wrapper around the assembly inner loop. Note that we assume a
+ * little-endian machine that supports unaligned loads. */
You should nevertheless use AV_RL32 for your unaligned LE loads
Thanks - I wasn't aware of that. I'll add it in.
1. You should add some benchmarks to the commit message.
Do you mean for each commit, or this one in particular? Are there any
particular standard files you'd expect to see benchmarked, or will the ones I
used in the cover-letter do?
With checkasm tests available, it'd be nice to have per-function
benchmarks in each of the patches that adds/tweaks a new function - so
you can see e.g. that the NEON version of a function is e.g. 8x faster
than the corresponding C function. That usually verifies that this
particular assembly function is beneficial (there have been cases where
people have contributed code which turned out to be slower than what the C
compiler produces).
Then overall, it can probably be nice to have a high level benchmark in
e.g. the cover letter, like "speeds up decoding <random clip> from xx fps
to yy fps on hardware zz".
(I'll make a longer reply to the other mail.)
// Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".