Fabien COELHO <coe...@cri.ensmp.fr> writes:
>> I'm trying to use random_zipfian() for benchmarking of skewed data sets, 
>> and I ran head-first into an issue with rather excessive CPU costs. 

> If you want skewed but not especially zipfian, use exponential which is 
> quite cheap. Also zipfian with a > 1.0 parameter does not have to compute 
> the harmonic number, so it depends on the parameter.

Maybe we should drop support for parameter values < 1.0, then.  The idea
that pgbench is doing something so expensive as to require caching seems
flat-out insane from here.  That cannot be seen as anything but a foot-gun
for unwary users.  Under what circumstances would an informed user use
that random distribution rather than another far-cheaper-to-compute one?
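
To make the cost concrete, here is a minimal sketch, not pgbench's actual
code. It assumes that for parameters below 1.0 the generator needs the
generalized harmonic number H(n, s) = 1/1^s + 1/2^s + ... + 1/n^s as a
normalizing constant, as the quoted remark implies. The function name
generalized_harmonic is hypothetical.

```python
def generalized_harmonic(n: int, s: float) -> float:
    """Return H(n, s) = sum of 1 / i**s for i in 1..n.

    Illustrative only: the loop does one term per value in the range,
    so the work is linear in n.
    """
    return sum(1.0 / i ** s for i in range(1, n + 1))


if __name__ == "__main__":
    # Hypothetical range size and parameter, chosen only for illustration.
    print(generalized_harmonic(1_000_000, 0.5))
```

Since the sum has one term per value in the range, the work is linear in n
and has to be redone for each distinct (n, s) pair, which would explain why
caching comes up at all.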

> ... This is why I submitted a pseudo-random permutation 
> function, which alas does not get much momentum from committers.

TBH, I think pgbench is now much too complex; it does not need more
features, especially not ones that need large caveats in the docs.
(What exactly is the point of having zipfian at all?)

                        regards, tom lane
