Fabien COELHO <coe...@cri.ensmp.fr> writes:
> [ pgbench-zipf-doc-3.patch ]

I started to look through this, and the more I looked the more unhappy I got that we're having this discussion at all. The zipfian support in pgbench is seriously over-engineered and under-documented. As an example, I was flabbergasted to find out that the end-of-run summary statistics now include this:

    /* Report zipfian cache overflow */
    for (i = 0; i < nthreads; i++)
    {
        totalCacheOverflows += threads[i].zipf_cache.overflowCount;
    }
    if (totalCacheOverflows > 0)
    {
        printf("zipfian cache array overflowed %d time(s)\n", totalCacheOverflows);
    }

What is the point of that, and if there is a point, why is it nowhere mentioned in pgbench.sgml? What would a user do with this information, and how would they know what to do?

I remain of the opinion that we ought to simply rip out support for zipfian with s < 1. It's not useful for benchmarking purposes to have a random-number function with such poor computational properties. I think leaving it in there is just a foot-gun: we'd be a lot better off throwing an error that tells people to use some other distribution.

Or if we do leave it in there, we for sure have to have documentation that *actually* explains how to use it, which this patch still doesn't. There's nothing suggesting that you'd better not use a large number of different (n,s) combinations.

			regards, tom lane