Hi, Nikolay! Have we reached a consensus?
16.02.2021, 17:09, "ткаленко кирилл" <tkalkir...@yandex.ru>:
> Hi, Zhenya!
>
> Users can also use it; I see nothing wrong with having two metrics.
>
> 16.02.2021, 16:50, "Zhenya Stanilovsky" <arzamas...@mail.ru.invalid>:
>> Kirill, is it good practice to have a metric for internal use? I don't
>> think so.
>> +1 with Nikolay: size is more readable than an abstract segment count.
>>
>>> Hi, Nikolay!
>>>
>>> For internal use, leave the metric that I propose and also add the metric
>>> "Count of bytes logged in WAL". Why not "written"? Because with mmap we
>>> cannot track when the physical write will occur.
>>>
>>> 16.02.2021, 15:42, "Nikolay Izhikov" < nizhi...@apache.org >:
>>>> Kirill.
>>>>
>>>> «Count of segments» is a very internal thing for a regular user.
>>>> A regular user doesn't want to know about such things.
>>>>
>>>> You suggest calculating the number (space required to store WAL) with
>>>> some kind of rough calculation, while with «Count of bytes written in
>>>> WAL» we can have the exact number without any assumptions or
>>>> calculations.
>>>>
>>>> Moreover, «Count of bytes written in WAL» is independent of the internal
>>>> WAL implementation.
>>>>
>>>> So, I think an exact number is always better to have than some
>>>> approximation.
>>>>
>>>> What do you think?
>>>>
>>>>> On 15 Feb 2021, at 20:45, ткаленко кирилл < tkalkir...@yandex.ru >
>>>>> wrote:
>>>>>
>>>>> Hi, Nikolay!
>>>>>
>>>>> We set the number of segments in the working directory, and we also
>>>>> delete by segment, so it seems that this is a matter of usability. I
>>>>> prefer to stick with my own version: it is a simple metric that does no
>>>>> harm, and you can add more as needed.
>>>>>
>>>>> 15.02.2021, 17:10, "Nikolay Izhikov" < nizhi...@apache.org >:
>>>>>> My point is that «count of files» is a meaningless number,
>>>>>> and «count of bytes written to the files» is a useful number to know
>>>>>> and use for capacity planning.
>>>>>>
>>>>>>> On 15 Feb 2021, at 15:59, ткаленко кирилл < tkalkir...@yandex.ru >
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi, Nikolay!
>>>>>>>
>>>>>>> It can be a number (count of segments * segment size) or it can be
>>>>>>> a count of segments, whichever is more convenient for the user.
>>>>>>>
>>>>>>> 15.02.2021, 13:14, "Nikolay Izhikov" < nizhi...@apache.org >:
>>>>>>>> Hello, Kirill.
>>>>>>>>
>>>>>>>> Thanks for the answers.
>>>>>>>> Now I understand your intentions.
>>>>>>>>
>>>>>>>>> It also seems that it will be more natural to operate not just
>>>>>>>>> in bytes but in multiples of a segment.
>>>>>>>>
>>>>>>>> Can’t agree here.
>>>>>>>> From my point of view, it’s better to know the exact number, not just
>>>>>>>> «count of segments».
>>>>>>>>
>>>>>>>>> On 15 Feb 2021, at 13:00, ткаленко кирилл < tkalkir...@yandex.ru > wrote:
>>>>>>>>>
>>>>>>>>> Hello, Nikolay!
>>>>>>>>>
>>>>>>>>> The period of one day (24h) seems more natural; you can take
>>>>>>>>> more or less. I think that one day may not be enough, and it is worth
>>>>>>>>> collecting the metric for several days, for example a week. Yes, the
>>>>>>>>> total size of the segments may not be
>>>>>>>>> DataStorageConfiguration#getMaxWalArchiveSize, but for capacity
>>>>>>>>> planning, accuracy is not so important to us, since the load can
>>>>>>>>> always change. It will hurt users more if the archive overflows and
>>>>>>>>> the node cannot start. So more is better than less here. It also
>>>>>>>>> seems that it will be more natural to operate not just in bytes but
>>>>>>>>> in multiples of a segment.
>>>>>>>>>
>>>>>>>>> In separate threads, you can discuss the metric that you propose
>>>>>>>>> about page memory and index estimates.
>>>>>>>>>
>>>>>>>>> 14.02.2021, 11:54, "Nikolay Izhikov" < nizhi...@apache.org >:
>>>>>>>>>> Hello, Kirill
>>>>>>>>>>
>>>>>>>>>> Your conclusions are still not clear to me.
>>>>>>>>>>
>>>>>>>>>>> It is not possible for us to estimate how much space a user
>>>>>>>>>>> will need in the archive so as not to overflow it under its load
>>>>>>>>>>> We take the maximum 44 and multiply it by a
>>>>>>>>>>> DataStorageConfiguration#getWalSegmentSize
>>>>>>>>>>
>>>>>>>>>> Why do you take a single day (24h) as the standard period? Is there
>>>>>>>>>> any rationale behind this?
>>>>>>>>>>
>>>>>>>>>> 1. We have the `walAutoArchiveAfterInactivity` property, so a WAL
>>>>>>>>>> segment can have a size less than the maximum.
>>>>>>>>>> 2. For the CDC feature I want to introduce a «WAL force rollover
>>>>>>>>>> timeout» to make data available to a consumer within a guaranteed
>>>>>>>>>> period [1].
>>>>>>>>>>
>>>>>>>>>> Why does the user want to estimate those numbers in the first
>>>>>>>>>> place?
>>>>>>>>>> Are we talking about some kind of capacity planning?
>>>>>>>>>>
>>>>>>>>>> If yes, then maybe it will be better to have a metric for the
>>>>>>>>>> count of bytes written to the WAL?
>>>>>>>>>> With it, we will have the exact amount of space we need for the WAL.
>>>>>>>>>>
>>>>>>>>>> How should the user estimate capacity for page memory and indexes?
>>>>>>>>>>
>>>>>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-13582
>>>>>>>>>>
>>>>>>>>>>> On 14 Feb 2021, at 09:48, ткаленко кирилл < tkalkir...@yandex.ru > wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi, Nikolay!
>>>>>>>>>>>
>>>>>>>>>>> The user will be able to take the getLastArchivedSegmentIndex
>>>>>>>>>>> every day and record it, say, for several days.
>>>>>>>>>>>
>>>>>>>>>>> For example, when starting the application, the
>>>>>>>>>>> getLastArchivedSegmentIndex is 0, then at the end of the first day
>>>>>>>>>>> the value will be 30, at the end of the second 55 and at the end of
>>>>>>>>>>> the third 99.
>>>>>>>>>>> It turns out that 30 segments were used for the first day, 25
>>>>>>>>>>> for the second and 44 for the third.
>>>>>>>>>>> We take the maximum, 44, and multiply it by
>>>>>>>>>>> DataStorageConfiguration#getWalSegmentSize, and we get a possible
>>>>>>>>>>> maximum that makes an archive overflow least likely. If the user
>>>>>>>>>>> uses compression, then it can be subtracted
>>>>>>>>>>> from the result (result * getMaxSizeCompressedArchivedSegment).
>>>>>>>>>>>
>>>>>>>>>>> 13.02.2021, 10:47, "Nikolay Izhikov" < nizhi...@apache.org >:
>>>>>>>>>>>> Hello, Kirill.
>>>>>>>>>>>>
>>>>>>>>>>>>> It is not possible for us to estimate how much space a
>>>>>>>>>>>>> user will need in the archive so as not to overflow it under its
>>>>>>>>>>>>> load
>>>>>>>>>>>>
>>>>>>>>>>>> It is still not clear to me why we need those metrics.
>>>>>>>>>>>> Can you please write down a specific scenario: how will the user
>>>>>>>>>>>> use these metrics to estimate the required WAL volume?
>>>>>>>>>>>>
>>>>>>>>>>>>> On 12 Feb 2021, at 19:35, ткаленко кирилл < tkalkir...@yandex.ru > wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi, Nikolay!
>>>>>>>>>>>>>
>>>>>>>>>>>>> It is not possible for us to estimate how much space a
>>>>>>>>>>>>> user will need in the archive so as not to overflow it under its
>>>>>>>>>>>>> load. And the proposed metrics will allow you to make a rough
>>>>>>>>>>>>> estimate.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 12.02.2021, 17:23, "Nikolay Izhikov" < nizhi...@apache.org >:
>>>>>>>>>>>>>> Hello, Kirill.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can you please clarify: what question about WAL does the user
>>>>>>>>>>>>>> have in mind?
>>>>>>>>>>>>>> And what answers does he (or she) get with these new metrics?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 12 Feb 2021, at 14:26, ткаленко кирилл < tkalkir...@yandex.ru > wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi everyone!
>>>>>>>>>>>>>>> At the moment, I have not found a way to
>>>>>>>>>>>>>>> estimate how many WAL segments fall into the archive, say per
>>>>>>>>>>>>>>> day.
>>>>>>>>>>>>>>> So I created a ticket
>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-14170 to add a
>>>>>>>>>>>>>>> couple of new metrics.
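
For reference, the rough estimate described in the thread (record getLastArchivedSegmentIndex once a day, take the largest daily difference, multiply by DataStorageConfiguration#getWalSegmentSize) could be sketched as follows. This is only an illustration, not Ignite code: the class and method names are hypothetical, and the daily readings and the segment size are assumed to be collected by the user and passed in as plain numbers.

```java
/**
 * Rough WAL archive sizing from daily readings of the last archived
 * segment index. Hypothetical helper, not part of Apache Ignite.
 */
final class WalArchiveEstimate {
    private WalArchiveEstimate() {
        // Static helpers only.
    }

    /**
     * @param dailyIndexes Readings of the last archived segment index:
     *      the starting value first, then one reading per day.
     * @return The largest number of segments archived in a single day.
     */
    static long maxSegmentsPerDay(long[] dailyIndexes) {
        if (dailyIndexes.length < 2)
            throw new IllegalArgumentException("Need at least two readings");

        long max = 0;

        for (int i = 1; i < dailyIndexes.length; i++) {
            long delta = dailyIndexes[i] - dailyIndexes[i - 1];

            // The index only grows, so a negative delta means bad input.
            if (delta < 0)
                throw new IllegalArgumentException("Readings must not decrease");

            max = Math.max(max, delta);
        }

        return max;
    }

    /**
     * @param dailyIndexes Same readings as in maxSegmentsPerDay.
     * @param walSegmentSizeBytes Configured WAL segment size in bytes.
     * @return Busiest day's segment count multiplied by the segment size.
     */
    static long estimateArchiveBytes(long[] dailyIndexes, long walSegmentSizeBytes) {
        return Math.multiplyExact(maxSegmentsPerDay(dailyIndexes), walSegmentSizeBytes);
    }
}
```

With the readings from the example in the thread (0, 30, 55, 99) the largest daily difference is 44. Taking a 64 MiB segment size as an example value, estimateArchiveBytes returns 44 * 64 MiB, that is 2.75 GiB per day. As noted in the discussion, this is an upper-bound style approximation: segments rolled over early are smaller than the configured size.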