I still think that the speed should compute a double to avoid integer
rounding errors within the computation. ISTM that rounding should only be
done for display in the end.

Ok, also done this way. I decided to print only full MB and not any
further digits as those don't seem very relevant.

For the computation to be in double, you must convert to double somewhere in the formula, otherwise it is computed as ints and only cast as a double afterwards. Maybe:

  current_speed = (1000.0 / MEGABYTES) * current_size / elapsed;

Or anything which converts to double early.

Instead of casting percent to int, maybe use "%.0f" as well, just like current_speed?

 +  if ((blockno % 100 == 0) && show_progress)

I'd invert the condition to avoid a modulo when not needed.

I'm not very found of global variables, but it does not matter here and doing it differently would involve more code. It would matter though if the progress report would be shared between commands.

I'm okay with calling the report on each file even if this means every few
GB...

For my I/O throttling branch I've backed off to only call the report
every 100 blocks (and at the end of the file). I've added that logic to
this patch as well.

The 100 blocks = 800 KiB looks arbitrary. What is the rational? ISTM that postgres will tend to store a power of two number of pages on full segments, so maybe there could be a rational for 1024. Not sure.

Someone suggested ETA, and it seems rather simple to do. What about
adding it?

I don't know, just computing the ETA is easy of course. But you'd have
to fiddle with seconds/minutes/hours conversions cause the runtime could
be hours. Then, you don't want to have it bumping around weirdly when
your I/O rate changes, cf. https://xkcd.com/612/ That doesn't sounds
simpler than the signal trigger we might get rid of.

Ok, it is more complicated that it looks for large sizes if second is not the right display unit.

--
Fabien.


Reply via email to