Hello Alvaro & Tom,

> Why not then insert a "few" rows, measure the size, truncate the table, compute the formula, and then insert up to the size the user requested? (Or insert what should be the minimum, scale 1, measure, and extrapolate what's missing.) It doesn't sound too complicated to me, and targeting a size is something that I believe is quite good for users.

The formula I used approximates the whole database, not just one table. There was one for the table, but this is only part of the issue. In particular, ISTM that index sizes should be included when caching is considered.

Also, index sizes are probably in n ln(n), so some level of approximation is inevitable.
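
To give a rough feeling for the error involved, here is a small sketch. It assumes the index size really behaves as c * n * ln(n); the constant c and the function names are made up for illustration and do not come from the patch.

```python
import math

def index_size_model(n_rows, c=1.0):
    """Hypothetical index size model: c * n * ln(n) (an assumption, not measured)."""
    return c * n_rows * math.log(n_rows)

def linear_extrapolation_error(n_measured, n_target):
    """Ratio of the modeled size at n_target to the size obtained by
    scaling the n_measured size linearly with the row count."""
    extrapolated = index_size_model(n_measured) * (n_target / n_measured)
    return index_size_model(n_target) / extrapolated

# Scale 1 has 100,000 account rows; scale 10,000 has 10^9 of them.
ratio = linear_extrapolation_error(100_000, 1_000_000_000)
print(f"modeled / linearly extrapolated index size: {ratio:.2f}")
```

Under this assumed model the ratio is ln(10^9) / ln(10^5), about 1.8, so a purely linear extrapolation from scale 1 would underestimate the index part by that factor. Whether real indexes follow this model is exactly the open question.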

Moreover, the intrinsic granularity of TPC-B, as a multiple of 100,000 rows, makes it not very precise wrt size anyway.

> Sure, makes sense, so my second suggestion seems more reasonable: insert with scale 1, measure there (ok, you might need to create indexes only to later drop them), and if the computed scale is > 1 then insert whatever is left to insert. Shouldn't be a big deal to me.

I could implement that, even if it would lead to some approximation nevertheless: ISTM that the very large scale regression performed by Kaarel is significantly more precise than testing with scale 1 (typically a few MiB) and extrapolating that to hundreds of GiB.
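
For reference, the measure-and-extrapolate idea boils down to something like the following minimal sketch. It assumes purely linear growth of the whole database size with the scale, which is precisely the approximation discussed here. The function name and the 15 MiB figure are hypothetical, not measurements.

```python
import math

def scale_for_target(target_bytes, size_at_scale_1_bytes):
    """Smallest integer scale whose linearly extrapolated size reaches
    target_bytes, given the database size measured at scale 1."""
    if size_at_scale_1_bytes <= 0:
        raise ValueError("measured size must be positive")
    # The scale is an integer >= 1, so the target can only be hit
    # to within one scale-1 increment.
    return max(1, math.ceil(target_bytes / size_at_scale_1_bytes))

MIB = 1024 ** 2
GIB = 1024 ** 3

# Hypothetical measurement: 15 MiB at scale 1, target of 100 GiB.
scale = scale_for_target(100 * GIB, 15 * MIB)
print(scale)
```

Any non-linear term, such as index growth, is ignored by this computation, so the resulting size may drift from the target at large scales.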

Maybe it could be done with a kind of open-ended dichotomy, but creating and recreating indexes looks like an ugly solution, and what should be significant is the whole database size, including the tellers & branches tables and all indexes, so I'm not convinced. Now, as the tellers & branches tables have basically the same structure as accounts, they could just be scaled by assuming that they incur the same storage per row.

Anyway, even if I do not like it, it could be better than nothing. The key point for me is that if Tom is dead set against the feature, the patch is dead anyway.

Tom, would Alvaro's approach be more admissible to you than a fixed formula that would need updating, keeping in mind that such a feature implies some level of approximation?

--
Fabien.
