On Mon, Nov 4, 2019 at 1:03 PM Darafei "Komяpa" Praliaskouski
<m...@komzpa.net> wrote:
>>
>>
>> This is somewhat similar to a memory usage problem with a
>> parallel query, where each worker is allowed to use up to work_mem of
>> memory.  We can say that users running a parallel operation expect
>> more system resources to be used because they want to get the operation
>> done faster, so we are fine with this.  However, I am not sure that
>> is the right thing, so we should try to come up with some solution for
>> it, and if the solution is too complex, then we can probably think of
>> documenting such behavior.
>
>
> In cloud environments (Amazon + gp2) there's a budget on input/output
> operations. If you exceed it for a long time, everything starts to feel
> like you're working with a floppy disk.
>
> For ease of configuration, I would need a "max_vacuum_disk_iops" that
> would limit the number of input/output operations performed by all of the
> vacuums in the system. If I set it to less than the budget refill rate, I
> can be sure that no vacuum runs fast enough to impact any sibling query.
>
> There's also value in a non-throttled VACUUM for smaller tables. On gp2
> such I/O will be consumed out of the surge budget, whose size is known to
> the sysadmin. Let's call it "max_vacuum_disk_surge_iops": if a relation
> has fewer blocks than this value and the situation is blocking in any way
> (antiwraparound, interactive console, ...), go ahead and run without
> throttling.
>

I think the need for these things can be addressed by the current
cost-based vacuum parameters; see the docs [1].  For example, setting
vacuum_cost_delay to zero allows the operation to be performed without
throttling.
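
For reference, here is a rough standalone sketch of how a cost-based
throttle of this kind behaves (simplified and hypothetical, not the actual
server code; maybe_delay, cost_delay_ms and cost_limit are illustrative
stand-ins for the vacuum_cost_* parameters): the nap only fires once the
accumulated cost reaches the limit, and with a zero delay it never fires,
so the operation runs unthrottled.

#include <unistd.h>

/* Illustrative stand-ins for the vacuum cost settings. */
static double cost_delay_ms = 2.0;   /* 0 means: never throttle */
static int    cost_limit    = 200;
static int    cost_balance  = 0;     /* cost accumulated since last nap */

static void
maybe_delay(int page_cost)
{
    cost_balance += page_cost;

    /* With cost_delay_ms = 0 this branch is never taken: no throttling. */
    if (cost_delay_ms > 0 && cost_balance >= cost_limit)
    {
        usleep((useconds_t) (cost_delay_ms * 1000));
        cost_balance = 0;
    }
}

int
main(void)
{
    for (int page = 0; page < 1000; page++)
        maybe_delay(10);             /* pretend every page costs 10 */
    return 0;
}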

> For how to balance the cost: if we know the number of vacuum processes
> that were running in the previous second, we can just divide this
> iteration's slot by that number.
>
> To correct for overshoots, we can subtract the previous second's overshoot
> from the next one's. That would also allow accounting for surge budget
> usage and letting it refill, pausing all autovacuum for some time after a
> manual one.
>
> Accounting at a finer granularity than once per second isn't beneficial
> for this use case.
>

I think it is better if we find a way to rebalance the cost when a worker
exits rather than every second, as it won't change anyway unless a worker
exits.
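
A minimal standalone sketch of that idea (hypothetical names such as
rebalance_cost, worker_exit and shared_cost_limit; this is not the actual
autovacuum balancing code): each active worker gets an equal share of the
shared cost limit, as in the division-by-worker-count idea above, and the
shares are recomputed only when a worker starts or exits rather than on a
timer.

#include <stdbool.h>
#include <stdio.h>

#define MAX_WORKERS 8

typedef struct
{
    bool active;
    int  cost_limit;             /* this worker's share of the shared limit */
} WorkerSlot;

static WorkerSlot workers[MAX_WORKERS];
static const int  shared_cost_limit = 200;   /* think vacuum_cost_limit */

/* Recompute every active worker's share; called on worker start/exit only. */
static void
rebalance_cost(void)
{
    int nactive = 0;

    for (int i = 0; i < MAX_WORKERS; i++)
        if (workers[i].active)
            nactive++;

    if (nactive == 0)
        return;

    for (int i = 0; i < MAX_WORKERS; i++)
        if (workers[i].active)
            workers[i].cost_limit = shared_cost_limit / nactive;
}

static void
worker_start(int slot)
{
    workers[slot].active = true;
    rebalance_cost();
}

static void
worker_exit(int slot)
{
    workers[slot].active = false;
    rebalance_cost();                /* rebalance on exit, not every second */
}

int
main(void)
{
    worker_start(0);
    worker_start(1);
    worker_start(2);
    printf("3 workers: limit per worker = %d\n", workers[0].cost_limit);

    worker_exit(2);
    printf("2 workers: limit per worker = %d\n", workers[0].cost_limit);
    return 0;
}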

[1] - 
https://www.postgresql.org/docs/devel/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-VACUUM-COST

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

