Hi, Nikolay!

For internal use, I suggest keeping the metric that I proposed and also adding one more metric: count of bytes logged in the WAL. I say "logged" rather than "written" because in mmap mode we cannot track when the physical write actually occurs.

16.02.2021, 15:42, "Nikolay Izhikov" <nizhi...@apache.org>:
> Kirill.
>
> «Count of segments» is a very internal thing for a regular user.
> A regular user doesn’t want to know about such things.
>
> You suggest calculating the number (the space required to store the WAL)
> with some kind of rough calculation, whereas with «Count of bytes written
> in WAL» we can have the exact number without any assumptions or calculations.
>
> Moreover, «Count of bytes written in WAL» is independent of the internal WAL
> implementation.
>
> So, I think an exact number is always better to have than an approximation.
>
> What do you think?
>
>> On 15 Feb 2021, at 20:45, ткаленко кирилл <tkalkir...@yandex.ru> wrote:
>>
>> Hi, Nikolay!
>>
>> We set the number of segments in the working directory, and we also delete
>> by segment, so it seems that this is a matter of usability. I prefer to stay
>> with my own version: it is a simple metric that does not hurt, and you can
>> add more as needed.
>>
>> 15.02.2021, 17:10, "Nikolay Izhikov" <nizhi...@apache.org>:
>>> My point is that «count of files» is a meaningless number.
>>> And «count of bytes written to the files» is a useful number to know and
>>> use for capacity planning.
>>>
>>>> On 15 Feb 2021, at 15:59, ткаленко кирилл <tkalkir...@yandex.ru> wrote:
>>>>
>>>> Hi, Nikolay!
>>>>
>>>> There may be a number (count of segments * segment size) or there may be
>>>> a count of segments, whichever is more convenient for the user.
>>>>
>>>> 15.02.2021, 13:14, "Nikolay Izhikov" <nizhi...@apache.org>:
>>>>> Hello, Kirill.
>>>>>
>>>>> Thanks for the answers.
>>>>> Now I understand your intentions.
>>>>>
>>>>>> It also seems that it will be more natural to operate not just in
>>>>>> bytes but in multiples of a segment.
>>>>>
>>>>> Can’t agree here.
>>>>> From my point of view - it’s better to know the exact number, not just
>>>>> «count of segments».
>>>>>
>>>>>> On 15 Feb 2021, at 13:00, ткаленко кирилл <tkalkir...@yandex.ru>
>>>>>> wrote:
>>>>>>
>>>>>> Hello, Nikolay!
>>>>>>
>>>>>> A period of one day (24h) seems the most natural; you can take more or
>>>>>> less. I think that one day may not be enough, and it is worth
>>>>>> collecting the metric for several days, for example a week.
>>>>>> Yes, the total size of the segments may not match
>>>>>> DataStorageConfiguration#getMaxWalArchiveSize, but for capacity
>>>>>> planning, accuracy is not so important to us, since the load can always
>>>>>> change. It will hurt users more if we overflow the archive and the node
>>>>>> cannot start. So more is better than less. It also seems that it will
>>>>>> be more natural to operate not just in bytes but in multiples of a
>>>>>> segment.
>>>>>>
>>>>>> The metrics that you propose for page memory and index estimates can
>>>>>> be discussed in separate threads.
>>>>>>
>>>>>> 14.02.2021, 11:54, "Nikolay Izhikov" <nizhi...@apache.org>:
>>>>>>> Hello, Kirill
>>>>>>>
>>>>>>> Your conclusions are still not clear to me.
>>>>>>>
>>>>>>>> It is not possible for us to estimate how much space a user will
>>>>>>>> need in the archive so as not to overflow it under its load
>>>>>>>> We take the maximum, 44, multiply it by
>>>>>>>> DataStorageConfiguration#getWalSegmentSize
>>>>>>>
>>>>>>> Why do you take a single day (24h) as the standard period? Is there
>>>>>>> any rationale behind this?
>>>>>>>
>>>>>>> 1. We have the `walAutoArchiveAfterInactivity` property. So a WAL
>>>>>>> segment can have a size less than the maximum.
>>>>>>> 2. For the CDC feature I want to introduce a «WAL force rollover
>>>>>>> timeout» to make data available for a consumer in a guaranteed
>>>>>>> period [1].
>>>>>>>
>>>>>>> Why does the user want to estimate those numbers in the first place?
>>>>>>> Are we talking about some kind of capacity planning?
>>>>>>>
>>>>>>> If yes, then maybe it will be better to have a metric for the count
>>>>>>> of bytes written in the WAL?
>>>>>>> With it, we will have the exact amount of space we need for the WAL.
>>>>>>>
>>>>>>> How should the user estimate capacity for page memory and indexes?
>>>>>>>
>>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-13582
>>>>>>>
>>>>>>>> On 14 Feb 2021, at 09:48, ткаленко кирилл <tkalkir...@yandex.ru>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi, Nikolay!
>>>>>>>>
>>>>>>>> The user will be able to read getLastArchivedSegmentIndex every day,
>>>>>>>> record it, and do so for, say, several days.
>>>>>>>>
>>>>>>>> For example, when the application starts,
>>>>>>>> getLastArchivedSegmentIndex is 0; then at the end of the first day
>>>>>>>> the value is 30, at the end of the second 55, and at the end of the
>>>>>>>> third 99.
>>>>>>>> It turns out that 30 segments were used on the first day, 25 on the
>>>>>>>> second and 44 on the third. We take the maximum, 44, multiply it by
>>>>>>>> DataStorageConfiguration#getWalSegmentSize, and get an upper
>>>>>>>> estimate with which an archive overflow is least likely. If the user
>>>>>>>> uses compression, the result can be reduced accordingly
>>>>>>>> (result * getMaxSizeCompressedArchivedSegment).
>>>>>>>>
>>>>>>>> 13.02.2021, 10:47, "Nikolay Izhikov" <nizhi...@apache.org>:
>>>>>>>>> Hello, Kirill.
>>>>>>>>>
>>>>>>>>>> It is not possible for us to estimate how much space a user
>>>>>>>>>> will need in the archive so as not to overflow it under its load
>>>>>>>>>
>>>>>>>>> It is still not clear to me why we need those metrics.
>>>>>>>>> Can you please write down a specific scenario - how will the user
>>>>>>>>> use these metrics to estimate the required WAL volume?
>>>>>>>>>
>>>>>>>>>> On 12 Feb 2021, at 19:35, ткаленко кирилл
>>>>>>>>>> <tkalkir...@yandex.ru> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi, Nikolay!
>>>>>>>>>>
>>>>>>>>>> It is not possible for us to estimate how much space a user
>>>>>>>>>> will need in the archive so as not to overflow it under its load.
>>>>>>>>>> The proposed metrics will allow a rough estimate to be made.
>>>>>>>>>>
>>>>>>>>>> 12.02.2021, 17:23, "Nikolay Izhikov" <nizhi...@apache.org>:
>>>>>>>>>>> Hello, Kirill.
>>>>>>>>>>>
>>>>>>>>>>> Can you please clarify - what question about the WAL does the
>>>>>>>>>>> user have in mind?
>>>>>>>>>>> And what answers does he (or she) get with these new metrics?
>>>>>>>>>>>
>>>>>>>>>>>> On 12 Feb 2021, at 14:26, ткаленко кирилл
>>>>>>>>>>>> <tkalkir...@yandex.ru> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi everyone!
>>>>>>>>>>>> At the moment, I have not found a way to estimate how many WAL
>>>>>>>>>>>> segments go into the archive, say per day.
>>>>>>>>>>>> So I created a ticket
>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-14170 to add a
>>>>>>>>>>>> couple of new metrics.
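
The segment-based estimate described in the thread (sample getLastArchivedSegmentIndex once per day, take the largest daily difference, multiply by the segment size) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical and not part of Ignite, the samples are passed in as a plain array rather than read from a real metric, and the 64 MiB segment size is just an example value standing in for DataStorageConfiguration#getWalSegmentSize.

```java
public class WalArchiveEstimate {
    /**
     * Returns the largest difference between two consecutive samples.
     * samples[0] is the value at start; samples[i] is the value read
     * at the end of period i (for example, at the end of day i).
     */
    static long maxSegmentsPerPeriod(long[] samples) {
        if (samples.length < 2)
            throw new IllegalArgumentException("At least two samples are required");

        long max = 0;

        for (int i = 1; i < samples.length; i++)
            max = Math.max(max, samples[i] - samples[i - 1]);

        return max;
    }

    /** Rough upper estimate in bytes: busiest period times segment size. */
    static long estimateArchiveBytes(long[] samples, long walSegmentSizeBytes) {
        // multiplyExact fails loudly instead of silently overflowing.
        return Math.multiplyExact(maxSegmentsPerPeriod(samples), walSegmentSizeBytes);
    }

    public static void main(String[] args) {
        // Values from the example in the thread: 0 at start, then 30, 55, 99.
        long[] samples = {0, 30, 55, 99};

        // Assumed segment size for illustration only.
        long segmentSize = 64L * 1024 * 1024;

        long segments = maxSegmentsPerPeriod(samples);
        long bytes = estimateArchiveBytes(samples, segmentSize);

        // Prints: 44 segments per day at most, about 2816 MiB
        System.out.println(segments + " segments per day at most, about "
            + bytes / (1024 * 1024) + " MiB");
    }
}
```

If a byte counter such as the "count of bytes logged in WAL" metric were available, the same maximum-difference step could be applied to daily samples of that counter directly, with no multiplication by the segment size.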