Fabien COELHO <coe...@cri.ensmp.fr> writes:
>> I'm trying to use random_zipfian() for benchmarking of skewed data sets,
>> and I ran head-first into an issue with rather excessive CPU costs.
> If you want skewed but not especially zipfian, use exponential which is
> quite cheap. Also zipfian with a > 1.0 parameter does not have to compute
> the harmonic number, so it depends in the parameter.

Maybe we should drop support for parameter values < 1.0, then. The idea
that pgbench is doing something so expensive as to require caching seems
flat-out insane from here. That cannot be seen as anything but a foot-gun
for unwary users. Under what circumstances would an informed user use
that random distribution rather than another far-cheaper-to-compute one?

> ... This is why I submitted a pseudo-random permutation
> function, which alas does not get much momentum from committers.

TBH, I think pgbench is now much too complex; it does not need more
features, especially not ones that need large caveats in the docs.
(What exactly is the point of having zipfian at all?)

regards, tom lane