Hi, Nikolay!

For internal use, let's keep the metric that I proposed and also add one more: 
the count of bytes logged in WAL. I say "logged" rather than "written" because 
with mmap we cannot track when the physical write will occur.
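
To illustrate the "logged" versus "written" distinction, here is a minimal sketch. It is not Ignite's WAL code: the class `MmapLogSketch` and its methods are hypothetical names made up for this example. It counts bytes at the moment a record is put into a memory-mapped buffer, because that is the only moment the application observes; when the OS flushes the dirty pages to disk is outside its control.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch of a log on a memory-mapped file; not Ignite's implementation. */
public class MmapLogSketch {
    private final MappedByteBuffer buf;

    /** Bytes handed to the log. Says nothing about what has reached the disk. */
    private final LongAdder loggedBytes = new LongAdder();

    public MmapLogSketch(Path file, int capacity) {
        try (FileChannel ch = FileChannel.open(file,
            StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping stays valid after the channel is closed.
            buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, capacity);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a sketch log backed by a temporary file. */
    public static MmapLogSketch onTempFile(int capacity) {
        try {
            return new MmapLogSketch(Files.createTempFile("wal-sketch", ".bin"), capacity);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Appends a record: this is the point where bytes count as "logged". */
    public void log(byte[] rec) {
        buf.put(rec);                // Copies into mapped memory; the OS writes it out later.
        loggedBytes.add(rec.length); // Observable immediately, unlike the physical write.
    }

    public long loggedBytes() {
        return loggedBytes.sum();
    }

    public static void main(String[] args) {
        MmapLogSketch log = onTempFile(1024);

        log.log(new byte[100]);
        log.log(new byte[28]);

        System.out.println(log.loggedBytes()); // 128
    }
}
```

A call to `MappedByteBuffer#force()` would request a flush, but the OS may also write pages out earlier on its own, so a "bytes physically written" counter cannot be maintained reliably from the application side.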

16.02.2021, 15:42, "Nikolay Izhikov" <nizhi...@apache.org>:
> Kirill.
>
> «Count of segments» is a very internal thing for a regular user.
> A regular user doesn’t want to know about such things.
>
> You suggest calculating the number (the space required to store WAL) with 
> some kind of rough calculation, while with «Count of bytes written in WAL» 
> we can have the exact number without any assumptions or calculations.
>
> Moreover, «Count of bytes written in WAL» is independent of the internal WAL 
> implementation.
>
> So, I think an exact number is always better to have than an approximation.
>
> What do you think?
>
>>  On 15 Feb 2021, at 20:45, Kirill Tkalenko <tkalkir...@yandex.ru> 
>> wrote:
>>
>>  Hi, Nikolay!
>>
>>  We set the number of segments in the working directory, and we also delete 
>> by segment, so this seems to be a matter of usability. I would prefer to 
>> stick with my own version: it is a simple metric that does no harm, and 
>> more can be added as needed.
>>
>>  15.02.2021, 17:10, "Nikolay Izhikov" <nizhi...@apache.org>:
>>>  My point is that «count of files» is a meaningless number,
>>>  while «count of bytes written to the files» is a useful number to know 
>>> and use for capacity planning.
>>>
>>>>   On 15 Feb 2021, at 15:59, Kirill Tkalenko <tkalkir...@yandex.ru> 
>>>> wrote:
>>>>
>>>>   Hi, Nikolay!
>>>>
>>>>   There may be a number (count of segments * segment size) or there may be 
>>>> a count of segments, whichever is more convenient for the user.
>>>>
>>>>   15.02.2021, 13:14, "Nikolay Izhikov" <nizhi...@apache.org>:
>>>>>   Hello, Kirill.
>>>>>
>>>>>   Thanks for the answers.
>>>>>   Now, I understand your intentions.
>>>>>
>>>>>>    It also seems that it will be more natural to operate not just bytes 
>>>>>> but multiples of a segment.
>>>>>
>>>>>   I can’t agree here.
>>>>>   From my point of view, it’s better to know the exact number, not just 
>>>>> «count of segments».
>>>>>
>>>>>>    On 15 Feb 2021, at 13:00, Kirill Tkalenko <tkalkir...@yandex.ru> 
>>>>>> wrote:
>>>>>>
>>>>>>    Hello, Nikolay!
>>>>>>
>>>>>>    A period of one day (24h) seems the most natural; you can take more 
>>>>>> or less. I think one day may not be enough, and it is worth collecting 
>>>>>> the metric over several days, for example a week. Yes, the total size 
>>>>>> of the segments may not match 
>>>>>> DataStorageConfiguration#getMaxWalArchiveSize, but for capacity 
>>>>>> planning accuracy is not so important to us, since the load can always 
>>>>>> change. It will hurt users more if the archive overflows and the node 
>>>>>> cannot start. So overestimating is better than underestimating. It 
>>>>>> also seems that it will be more natural to operate not just bytes but 
>>>>>> multiples of a segment.
>>>>>>
>>>>>>    The metrics that you propose for page memory and index estimates can 
>>>>>> be discussed in separate threads.
>>>>>>
>>>>>>    14.02.2021, 11:54, "Nikolay Izhikov" <nizhi...@apache.org>:
>>>>>>>    Hello, Kirill
>>>>>>>
>>>>>>>    Your conclusions are still not clear to me.
>>>>>>>
>>>>>>>>      It is not possible for us to estimate how much space a user will 
>>>>>>>> need in the archive so as not to overflow it under its load
>>>>>>>>      We take the maximum 44 and multiply it by a 
>>>>>>>> DataStorageConfiguration#getWalSegmentSize
>>>>>>>
>>>>>>>    Why do you take a single day (24h) as the standard period? Is there 
>>>>>>> any rationale behind this?
>>>>>>>
>>>>>>>    1. We have the `walAutoArchiveAfterInactivity` property. So a WAL 
>>>>>>> segment can have a size less than the maximum.
>>>>>>>    2. For the CDC feature I want to introduce a «WAL force rollover 
>>>>>>> timeout» to make data available for a consumer in a guaranteed period [1].
>>>>>>>
>>>>>>>    Why does the user want to estimate those numbers in the first place?
>>>>>>>    Are we talking about some kind of capacity planning?
>>>>>>>
>>>>>>>    If yes, then maybe it will be better to have a metric for the count 
>>>>>>> of bytes written in the WAL?
>>>>>>>    With it, we will have the exact amount of space we need for WAL.
>>>>>>>
>>>>>>>    How should a user estimate capacity for page memory and indexes?
>>>>>>>
>>>>>>>    [1] https://issues.apache.org/jira/browse/IGNITE-13582
>>>>>>>
>>>>>>>>     On 14 Feb 2021, at 09:48, Kirill Tkalenko <tkalkir...@yandex.ru> 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>     Hi, Nikolay!
>>>>>>>>
>>>>>>>>     The user will be able to read getLastArchivedSegmentIndex once a 
>>>>>>>> day, record the value, and repeat this for, say, several days.
>>>>>>>>
>>>>>>>>     For example, when the application starts, 
>>>>>>>> getLastArchivedSegmentIndex is 0; at the end of the first day the 
>>>>>>>> value is 30, at the end of the second 55, and at the end of the 
>>>>>>>> third 99.
>>>>>>>>     It turns out that 30 segments were used on the first day, 25 on 
>>>>>>>> the second and 44 on the third. We take the maximum, 44, and multiply 
>>>>>>>> it by DataStorageConfiguration#getWalSegmentSize, and we get an upper 
>>>>>>>> estimate with which an archive overflow is least likely. If the user 
>>>>>>>> uses compression, then the result can be reduced accordingly 
>>>>>>>> (result * getMaxSizeCompressedArchivedSegment).
>>>>>>>>
>>>>>>>>     13.02.2021, 10:47, "Nikolay Izhikov" <nizhi...@apache.org>:
>>>>>>>>>     Hello, Kirill.
>>>>>>>>>
>>>>>>>>>>      It is not possible for us to estimate how much space a user 
>>>>>>>>>> will need in the archive so as not to overflow it under its load
>>>>>>>>>
>>>>>>>>>     It is still not clear to me why we need those metrics.
>>>>>>>>>     Can you please write down a specific scenario: how will a user 
>>>>>>>>> use these metrics to estimate the required WAL volume?
>>>>>>>>>
>>>>>>>>>>      On 12 Feb 2021, at 19:35, Kirill Tkalenko 
>>>>>>>>>> <tkalkir...@yandex.ru> wrote:
>>>>>>>>>>
>>>>>>>>>>      Hi, Nikolay!
>>>>>>>>>>
>>>>>>>>>>      It is not possible for us to estimate how much space a user 
>>>>>>>>>> will need in the archive so as not to overflow it under its load. 
>>>>>>>>>> And the proposed metrics will allow you to make a rough estimate.
>>>>>>>>>>
>>>>>>>>>>      12.02.2021, 17:23, "Nikolay Izhikov" <nizhi...@apache.org>:
>>>>>>>>>>>      Hello, Kirill.
>>>>>>>>>>>
>>>>>>>>>>>      Can you please clarify: what question about WAL does the user 
>>>>>>>>>>> have in mind?
>>>>>>>>>>>      And what answers does he (or she) get with these new metrics?
>>>>>>>>>>>
>>>>>>>>>>>>       On 12 Feb 2021, at 14:26, Kirill Tkalenko 
>>>>>>>>>>>> <tkalkir...@yandex.ru> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>       Hi everyone!
>>>>>>>>>>>>       At the moment, I have not found a way to estimate 
>>>>>>>>>>>> how many WAL segments go into the archive, say, per day.
>>>>>>>>>>>>       So I created a ticket 
>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-14170 to add a couple 
>>>>>>>>>>>> of new metrics.
