Lynne via ffmpeg-devel <ffmpeg-devel@ffmpeg.org> writes: > On 13/08/2024 16:03, J. Dekker wrote: >> Port dav1d's checkasm output format to FFmpeg's checkasm, includes >> relative speedups and aligns results. >> Signed-off-by: J. Dekker <j...@itanimul.li> >> --- >> tests/checkasm/checkasm.c | 53 +++++++++++++++++++++++++++++++++++---- >> 1 file changed, 48 insertions(+), 5 deletions(-) >> diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c >> index f82ee0864f..0095758268 100644 >> --- a/tests/checkasm/checkasm.c >> +++ b/tests/checkasm/checkasm.c >> @@ -18,6 +18,31 @@ >> * You should have received a copy of the GNU General Public License along >> * with FFmpeg; if not, write to the Free Software Foundation, Inc., >> * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. >> + * >> + * Copyright © 2018, VideoLAN and dav1d authors >> + * Copyright © 2018, Two Orioles, LLC >> + * All rights reserved. >> + * >> + * Redistribution and use in source and binary forms, with or without >> + * modification, are permitted provided that the following conditions are >> met: >> + * >> + * 1. Redistributions of source code must retain the above copyright >> notice, this >> + * list of conditions and the following disclaimer. >> + * >> + * 2. Redistributions in binary form must reproduce the above copyright >> notice, >> + * this list of conditions and the following disclaimer in the >> documentation >> + * and/or other materials provided with the distribution. >> + * >> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS >> IS" AND >> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE >> IMPLIED >> + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE >> + * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE >> LIABLE FOR >> + * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL >> DAMAGES >> + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR >> SERVICES; >> + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED >> AND >> + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR >> TORT >> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF >> THIS >> + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. >> */ >> #include "config.h" >> @@ -575,6 +600,16 @@ static int measure_nop_time(void) >> return nop_sum / 500; >> } >> +static inline double avg_cycles_per_call(const CheckasmPerf *const p) >> +{ >> + if (p->iterations) { >> + const double cycles = (double)(10 * p->cycles) / p->iterations - >> state.nop_time; >> + if (cycles > 0.0) >> + return cycles / 4.0; /* 4 calls per iteration */ >> + } >> + return 0.0; >> +} >> + >> /* Print benchmark results */ >> static void print_benchs(CheckasmFunc *f) >> { >> @@ -584,17 +619,25 @@ static void print_benchs(CheckasmFunc *f) >> /* Only print functions with at least one assembly version */ >> if (f->versions.cpu || f->versions.next) { >> CheckasmFuncVersion *v = &f->versions; >> + const CheckasmPerf *p = &v->perf; >> + const double baseline = avg_cycles_per_call(p); >> + double decicycles; >> do { >> - CheckasmPerf *p = &v->perf; >> if (p->iterations) { >> - int decicycles = (10*p->cycles/p->iterations - >> state.nop_time) / 4; >> + p = &v->perf; >> + decicycles = avg_cycles_per_call(p); >> if (state.csv) { >> const char sep = state.tsv ? '\t' : ','; >> - printf("%s%c%s%c%d.%d\n", f->name, sep, >> + printf("%s%c%s%c%.1f\n", f->name, sep, >> cpu_suffix(v->cpu), sep, >> - decicycles / 10, decicycles % 10); >> + decicycles / 10.0); >> } else { >> - printf("%s_%s: %d.%d\n", f->name, >> cpu_suffix(v->cpu), decicycles/10, decicycles%10); >> + const int pad_length = 10 + 50 - >> + printf("%s_%s:", f->name, cpu_suffix(v->cpu)); >> + const double ratio = decicycles ? >> + baseline / decicycles : 0.0; >> + printf("%*.1f (%5.2fx)\n", FFMAX(pad_length, 0), >> + decicycles / 10.0, ratio); >> } >> } >> } while ((v = v->next)); > > How does it improve it? > > You're only interested in the last X iterations, after cache has fully warmed > up and is out of the equation. Averaging all results from all iteration would > be also benchmarking the memory layout of the system, but only the cycles are > of interest.
The semantics of how the benchmark is calculated is essentially unchanged by this commit. The only intention is to follow dav1d's checkasm format and include a relative speedup for easier at-a-glance comparisons. -- jd _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".