Hello Tom,
Here is an attempt at extending --scale so that it can be given a size.
I do not actually find this to be a good idea. It's going to be
platform-dependent, or not very accurate, or both, and thereby
contribute to confusion by making results less reproducible.
I have often wanted to have such an option for testing, with criteria
like "within shared_buffers", "within memory", "twice the available
memory", to look for behavioral changes in some performance tests.
If you want reproducible (for some definition of reproducible) and
accurate results, you can always use --scale with a number. The report provides the
actual scale used anyway, so providing the size is just a convenience for
the initialization phase. I agree that it cannot be really exact.
Would it be more acceptable with some clear(er)/explicit caveat?
Plus, what do we do if the backend changes table representation in
some way that invalidates Kaarel's formula altogether?
Then the formula (a simple linear regression, really) would have to be
updated?
More confusion would be inevitable.
There is not much confusion when the "scale" is reported. As for confusion,
a performance test is influenced by dozens of parameters anyway.
Now if you do not want such a feature, you can mark it as rejected, and we
will keep on trying to guess or look for the formula till the end of
time:-)
--
Fabien.