On Wed, 7 Jul 2021 at 04:00, Yura Sokolov <y.soko...@postgrespro.ru> wrote: > > Anyway, excuse me for heating this discussion cause of such > non-essential issue. > I'll try to control myself and don't proceed it further. >
Whilst it has been interesting learning and discussing all these different techniques, I think it's probably best to stick with the bitmask method, rather than making the code too complex and difficult to follow. The bitmask method has the advantage of being very simple, easy to understand and fast (fastest in many of the benchmarks, and close enough in others to make me think that the difference won't matter for our purposes). To test the current patch, I hacked up a simple SQL-callable server function: random(bigint, bigint) returns bigint, similar to the one in pgbench. After doing so, I couldn't help thinking that it would be useful to have such a function in core, so maybe that could be a follow-on patch. Anyway, that led to the following observations: Firstly, there's a bug in the existing mask_u64() code -- if pg_leftmost_one_pos64(u) returns 63, you end up with a mask equal to 0, and it breaks down. Secondly, I think it would be simpler to implement this as a bitshift, rather than a bitmask, using the high bits from the random number. That might not make much difference for xoroshiro**, but in general, PRNGs tend to be weaker in the lower bits, so it seems preferable on that basis. But also, it makes the code simpler and less error-prone. Finally, I think it would be better to treat the upper bound of the range as inclusive. Doing so makes the function able to cover all possible 64-bit ranges. It would then be easy (perhaps in another follow-on patch) to make the pgbench random() function work for all 64-bit bounds (as long as max >= min), without the weird overflow checking it currently has. Putting those 3 things together, the code (minus comments) becomes: if (range > 0) { int rshift = 63 - pg_leftmost_one_pos64(range); do { val = xoroshiro128ss(state) >> rshift; } while (val > range); } else val = 0; which reduces the complexity a bit. Regards, Dean