Hi Ilya,

Regarding the throttling question, I have not yet looked at thread dumps -
the observed behaviour has been seen in production metrics and logging.
What would you expect a thread dump to show in this case?

Given my description of the sizes of the data regions and the numbers of
pages being updated in a checkpoint would you expect any
throttling behaviour?

Thanks,
Raymond.

On Mon, Dec 28, 2020 at 11:53 PM Ilya Kasnacheev <ilya.kasnach...@gmail.com>
wrote:

> Hello!
>
> 1. If we knew the specific circumstances in which a specific setting value
> will yield the most benefit, we would've already set it to that value. A
> setting means that you may tune it and get better results, or not. But in
> general we can't promise you anything. I did see improvements from
> increasing this setting in a very specific setup, but in general you may
> leave it as is.
>
> 2. More frequent checkpoints mean increased write amplification. So
> reducing this value may overwhelm your system with load that it was able to
> handle previously. You can set this setting to arbitrary small value,
> meaning that checkpoints will be purely sequential without any pauses
> between them.
>
> 3. I don't think that default throttling mechanism will emit any warnings.
> What do you see in thread dumps?
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> ср, 23 дек. 2020 г. в 12:48, Raymond Wilson <raymond_wil...@trimble.com>:
>
>> Hi,
>>
>> We have been investigating some issues which appear to be related to
>> checkpointing. We currently use the IA 2.8.1 with the C# client.
>>
>> I have been trying to gain clarity on how certain aspects of the Ignite
>> configuration relate to the checkpointing process:
>>
>> 1. Number of check pointing threads. This defaults to 4, but I don't
>> understand how it applies to the checkpointing process. Are more threads
>> generally better (eg: because it makes the disk IO parallel across the
>> threads), or does it only have a positive effect if you have many data
>> storage regions? Or something else? If this could be clarified in the
>> documentation (or a pointer to it which Google has not yet found), that
>> would be good.
>>
>> 2. Checkpoint frequency. This is defaulted to 180 seconds. I was thinking
>> that reducing this time would result in smaller less disruptive check
>> points. Setting it to 60 seconds seems pretty safe, but is there a
>> practical lower limit that should be used for use cases with new data
>> constantly being added, eg: 5 seconds, 10 seconds?
>>
>> 3. Write exclusivity constraints during checkpointing. I understand that
>> while a checkpoint is occurring ongoing writes will be supported into the
>> caches being check pointed, and if those are writes to existing pages then
>> those will be duplicated into the checkpoint buffer. If this buffer becomes
>> full or stressed then Ignite will throttle, and perhaps block, writes until
>> the checkpoint is complete. If this is the case then Ignite will emit
>> logging (warning or informational?) that writes are being throttled.
>>
>> We have cases where simple puts to caches (a few requests per second) are
>> taking up to 90 seconds to execute when there is an active check point
>> occurring, where the check point has been triggered by the checkpoint
>> timer. When a checkpoint is not occurring the time to do this is usually in
>> the milliseconds. The checkpoints themselves can take 90 seconds or longer,
>> and are updating up to 30,000-40,000 pages, across a pair of data storage
>> regions, one with 4Gb in-memory space allocated (which should be 1,000,000
>> pages at the standard 4kb page size), and one small region with 128Mb.
>> There is no 'throttling' logging being emitted that we can tell, so the
>> checkpoint buffer (which should be 1Gb for the first data region and 256 Mb
>> for the second smaller region in this case) does not look like it can fill
>> up during the checkpoint.
>>
>> It seems like the checkpoint is affecting the put operations, but I don't
>> understand why that may be given the documented checkpointing process, and
>> the checkpoint itself (at least via Informational logging) is not
>> advertising any restrictions.
>>
>> Thanks,
>> Raymond.
>>
>> --
>> <http://www.trimble.com/>
>> Raymond Wilson
>> Solution Architect, Civil Construction Software Systems (CCSS)
>>
>>

-- 
<http://www.trimble.com/>
Raymond Wilson
Solution Architect, Civil Construction Software Systems (CCSS)
11 Birmingham Drive | Christchurch, New Zealand
+64-21-2013317 Mobile
raymond_wil...@trimble.com

<https://worksos.trimble.com/?utm_source=Trimble&utm_medium=emailsign&utm_campaign=Launch>

Reply via email to