My point is that a «count of files» is a meaningless number, while a «count of bytes written to the files» is a useful number to know and to use for capacity planning.
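
To illustrate the difference between the two estimates, here is a minimal sketch in plain Python. It does not call any Ignite API. The segment indexes mirror the example quoted further down in this thread. The 64 MiB segment size and the cumulative byte samples are assumptions made up for the illustration, and the byte counter stands in for a metric that is only being proposed here.

```python
# Assumed WAL segment size in bytes (64 MiB); a real deployment would read it
# from DataStorageConfiguration#getWalSegmentSize.
SEGMENT_SIZE = 64 * 1024 * 1024


def daily_deltas(samples):
    """Turn a cumulative counter sampled once per day into per-day increments."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def estimate_from_segments(segment_indexes, segment_size):
    """Busiest day counted in segments, multiplied by the full segment size."""
    return max(daily_deltas(segment_indexes)) * segment_size


def estimate_from_bytes(bytes_written):
    """Busiest day measured directly in bytes written."""
    return max(daily_deltas(bytes_written))


# Last archived segment index at start-up and at the end of days 1, 2 and 3.
segment_indexes = [0, 30, 55, 99]

# Hypothetical cumulative "bytes written to WAL" samples for the same days.
bytes_written = [0, 1_500_000_000, 2_900_000_000, 5_000_000_000]

print(daily_deltas(segment_indexes))                          # [30, 25, 44]
print(estimate_from_segments(segment_indexes, SEGMENT_SIZE))  # 2952790016
print(estimate_from_bytes(bytes_written))                     # 2100000000
```

The segment-based figure assumes every archived segment is full. Segments that roll over early, for example because of `walAutoArchiveAfterInactivity`, make it an overestimate, while the byte-based figure needs no such assumption.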
> On 15 Feb 2021, at 15:59, ткаленко кирилл <tkalkir...@yandex.ru> wrote:
>
> Hi, Nikolay!
>
> There may be a number (count of segments * segment size) or there may be a
> count of segments, whichever is more convenient for the user.
>
> 15.02.2021, 13:14, "Nikolay Izhikov" <nizhi...@apache.org>:
>> Hello, Kirill.
>>
>> Thanks for the answers.
>> Now I understand your intentions.
>>
>>> It also seems that it will be more natural to operate not just bytes but
>>> multiples of a segment.
>>
>> Can’t agree here.
>> From my point of view, it’s better to know the exact number, not just a
>> «count of segments».
>>
>>> On 15 Feb 2021, at 13:00, ткаленко кирилл <tkalkir...@yandex.ru> wrote:
>>>
>>> Hello, Nikolay!
>>>
>>> A period of one day (24h) seems more natural; you can take more or less.
>>> I think that one day may not be enough, and it is worth collecting the
>>> metric for several days, for example a week. Yes, the total size of the
>>> segments may not be DataStorageConfiguration#getMaxWalArchiveSize, but
>>> for capacity planning accuracy is not so important to us, since the load
>>> can always change. It will hurt users more if we overflow the archive and
>>> the node is not able to start. So more is better than less. It also seems
>>> that it will be more natural to operate not just bytes but multiples of a
>>> segment.
>>>
>>> In separate threads, you can discuss the metric that you propose about
>>> page memory and indexes estimates.
>>>
>>> 14.02.2021, 11:54, "Nikolay Izhikov" <nizhi...@apache.org>:
>>>> Hello, Kirill
>>>>
>>>> Your conclusions are still not clear to me.
>>>>
>>>>> It is not possible for us to estimate how much space a user will need
>>>>> in the archive so as not to overflow it under its load
>>>>> We take the maximum 44 and multiply it by a
>>>>> DataStorageConfiguration#getWalSegmentSize
>>>>
>>>> Why do you take a single day (24h) as the standard period?
>>>> Is there any rationale behind this?
>>>>
>>>> 1. We have the `walAutoArchiveAfterInactivity` property, so a WAL
>>>> segment can have a size less than the maximum.
>>>> 2. For the CDC feature I want to introduce a «WAL force rollover
>>>> timeout» to make data available for a consumer in a guaranteed period [1].
>>>>
>>>> Why does the user want to estimate those numbers in the first place?
>>>> Are we talking about some kind of capacity planning?
>>>>
>>>> If yes, then maybe it will be better to have a metric for a count of
>>>> bytes written in the WAL?
>>>> With it, we will know exactly how much space we need for the WAL.
>>>>
>>>> How should the user estimate capacity for page memory and indexes?
>>>>
>>>> [1] https://issues.apache.org/jira/browse/IGNITE-13582
>>>>
>>>>> On 14 Feb 2021, at 09:48, ткаленко кирилл <tkalkir...@yandex.ru> wrote:
>>>>>
>>>>> Hi, Nikolay!
>>>>>
>>>>> The user will be able to take the getLastArchivedSegmentIndex every day
>>>>> and remember it and do it, say, for several days.
>>>>>
>>>>> For example, when starting the application, the
>>>>> getLastArchivedSegmentIndex is 0, then at the end of the first day the
>>>>> value will be 30, at the end of the second 55, and at the end of the
>>>>> third 99.
>>>>> It turns out that 30 segments were used for the first day, 25 for the
>>>>> second and 44 for the third. We take the maximum 44 and multiply it by a
>>>>> DataStorageConfiguration#getWalSegmentSize, and we get the possible
>>>>> maximum that the archive overflow was the least likely. If the user uses
>>>>> compression, then it can be subtracted from the result (result *
>>>>> getMaxSizeCompressedArchivedSegment).
>>>>>
>>>>> 13.02.2021, 10:47, "Nikolay Izhikov" <nizhi...@apache.org>:
>>>>>> Hello, Kirill.
>>>>>>
>>>>>>> It is not possible for us to estimate how much space a user will
>>>>>>> need in the archive so as not to overflow it under its load
>>>>>>
>>>>>> It is still not clear to me why we need those metrics.
>>>>>> Can you please write down a specific scenario: how will the user use
>>>>>> these metrics to estimate the required WAL volume?
>>>>>>
>>>>>>> On 12 Feb 2021, at 19:35, ткаленко кирилл <tkalkir...@yandex.ru> wrote:
>>>>>>>
>>>>>>> Hi, Nikolay!
>>>>>>>
>>>>>>> It is not possible for us to estimate how much space a user will
>>>>>>> need in the archive so as not to overflow it under its load. And the
>>>>>>> proposed metrics will allow you to make a rough estimate.
>>>>>>>
>>>>>>> 12.02.2021, 17:23, "Nikolay Izhikov" <nizhi...@apache.org>:
>>>>>>>> Hello, Kirill.
>>>>>>>>
>>>>>>>> Can you please clarify: what question about the WAL does the user
>>>>>>>> have in mind?
>>>>>>>> And what answers does he (or she) get with these new metrics?
>>>>>>>>
>>>>>>>>> On 12 Feb 2021, at 14:26, ткаленко кирилл <tkalkir...@yandex.ru> wrote:
>>>>>>>>>
>>>>>>>>> Hi everyone!
>>>>>>>>> At the moment, I have not found a way to estimate how
>>>>>>>>> many WAL segments fall into the archive, say per day.
>>>>>>>>> So I created a ticket
>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-14170 to add a couple of
>>>>>>>>> new metrics.