Hi,

I'm experimenting with adding performance regression testing to our CI. Currently, our CI has quite extensive functional coverage but completely lacks performance testing. Given that we use pytest, pytest-benchmark (https://pytest-benchmark.readthedocs.io/en/latest/) looks like a good candidate framework.

I've prototyped things in https://github.com/OSGeo/gdal/pull/8538

Basically, we now have an autotest/benchmark directory where performance tests can be written.
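
For illustration, a minimal test in that directory could look like the sketch below (the dataset path and test name are just placeholders of mine, not something taken from the PR):

    # hypothetical example of a pytest-benchmark test
    from osgeo import gdal

    def test_read_band_checksum(benchmark):
        def read():
            ds = gdal.Open("data/utmsmall.tif")  # placeholder dataset
            return ds.GetRasterBand(1).Checksum()
        benchmark(read)

The benchmark fixture provided by the plugin runs the function several times and records timing statistics (min, mean, stddev, etc.).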

Then, in the CI, we check out a reference commit, build it, and run the performance test suite in --benchmark-save mode.

We then run the performance test suite on the PR in --benchmark-compare mode with a --benchmark-compare-fail="mean:5%" criterion (which means that a test fails if its mean runtime is more than 5% slower than the reference one).
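
Roughly, the two phases boil down to the following pytest invocations (expressed through pytest.main() purely for illustration; in the CI they are of course separate runs against the reference build and the PR build, and the "ref" baseline name is an arbitrary choice of mine):

    import pytest

    # Phase 1, on the reference build: record baseline timings
    pytest.main(["autotest/benchmark", "--benchmark-save=ref"])

    # Phase 2, on the PR build: compare against the latest saved run and
    # fail any test whose mean runtime is more than 5% slower
    pytest.main(["autotest/benchmark",
                 "--benchmark-compare",
                 "--benchmark-compare-fail=mean:5%"])

This assumes the default .benchmarks storage directory is preserved between the two runs so that the comparison can find the saved baseline.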

From what I can see, pytest-benchmark behaves correctly when tests are removed or added (that is, it does not fail, it just skips them during comparison). The only thing one should not do is modify an existing test with respect to the reference branch.

Does anyone have practical experience with pytest-benchmark, in particular in CI setups? With virtualization, it is hard to guarantee that other things happening on the host running the VM won't interfere. Even locally on my own machine, I initially saw strong variations in timings, which I could reduce to an acceptable deviation by disabling the Intel Turbo Boost feature (echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo).

Even

--
http://www.spatialys.com
My software is free, but my time generally not.
