> As a compromise, I can add jmx methods (rebuilding indexes in the process
> and the percentage of rebuilding) for the entire node, but I tried to find
> a suitable place and did not find it, tell me where to add it?

I have checked the existing JMX beans. To be honest, I struggle to find a suitable place as well. We have ClusterMetrics, which may represent the state of a local node, but this class is also used for aggregated cluster metrics, and I can't propose a reasonable way to merge percentages from different nodes.

On the other hand, a total index rebuild for all caches isn't a common scenario. It is performed either after manual index.bin removal or after index creation, and both operations are performed at the cache / cache-group level. Also, all other similar metrics are provided at the cache-group level.

I propose to stick with a cache-group level metric (e.g. getIndexBuildProgress) that returns a float from 0 to 1, calculated as [processedKeys] / [localCacheSize]. Even if a user handles metrics through Zabbix, I anticipate that he'll perform this calculation on his own in order to estimate progress. Let's help him a bit and perform it on the system side.

If a per-group percentage metric is present, I think getIndexRebuildKeyProcessed becomes redundant.

On Tue, Aug 11, 2020 at 8:20 AM ткаленко кирилл <tkalkir...@yandex.ru> wrote:

> Hi, Ivan!
>
> What precision would be sufficient?
> > If the progress is very slow, I don't see issues with tracking it if the
> > percentage float has enough precision.
>
> I think we can add a mention getting cache size.
> > 1. Gain an understanding that local cache size
> > (CacheMetricsImpl#getCacheSize) should be used as a 100% milestone (it
> > isn't mentioned neither in javadoc nor in JMX method description).
>
> Do you think users collect metrics with their hands? I think this is done
> by other systems, such as zabbix.
> > 2. Manually calculate sum of all metrics and divide to sum of all cache
> > sizes.
>
> As a compromise, I can add jmx methods (rebuilding indexes in the process
> and the percentage of rebuilding) for the entire node, but I tried to find
> a suitable place and did not find it, tell me where to add it?
> > On the other hand, % of index rebuild progress is self-descriptive. I don't
> > understand why we tend to make user's life harder.
>
> 10.08.2020, 21:57, "Ivan Rakov" <ivan.glu...@gmail.com>:
> >> This metric can be used only for local node, to get size of cache use
> >> org.apache.ignite.internal.processors.cache.CacheMetricsImpl#getCacheSize.
> >
> > Got it, agree.
> >
> >> If there is a lot of data in node that can be rebuilt, percentage may
> >> change very rarely and may not give an estimate of how much time is left.
> >> If we see for example that 50_000 keys are rebuilt once a minute, and we
> >> have 1_000_000_000 keys, then we can have an approximate estimate. What do
> >> you think of that?
> >
> > If the progress is very slow, I don't see issues with tracking it if the
> > percentage float has enough precision.
> > Still, usability of the metric concerns me. In order to estimate remaining
> > time of index rebuild, user should:
> > 1. Gain an understanding that local cache size
> > (CacheMetricsImpl#getCacheSize) should be used as a 100% milestone (it
> > isn't mentioned neither in javadoc nor in JMX method description).
> > 2. Manually calculate sum of all metrics and divide to sum of all cache
> > sizes.
> > On the other hand, % of index rebuild progress is self-descriptive. I don't
> > understand why we tend to make user's life harder.
> >
> > --
> > Best regards,
> > Ivan
> >
> > On Mon, Aug 10, 2020 at 8:53 PM ткаленко кирилл <tkalkir...@yandex.ru>
> > wrote:
> >
> >> Hi, Ivan!
> >>
> >> For this you can use
> >> org.apache.ignite.cache.CacheMetrics#IsIndexRebuildInProgress
> >> > How can a local number of processed keys can help us to understand when
> >> > index rebuild will be finished?
> >>
> >> This metric can be used only for local node, to get size of cache use
> >> org.apache.ignite.internal.processors.cache.CacheMetricsImpl#getCacheSize.
> >> > We can't compare metric value with cache.size().
> >> > First one is node-local,
> >> > while cache size covers all partitions in the cluster.
> >>
> >> If there is a lot of data in node that can be rebuilt, percentage may
> >> change very rarely and may not give an estimate of how much time is left.
> >> If we see for example that 50_000 keys are rebuilt once a minute, and we
> >> have 1_000_000_000 keys, then we can have an approximate estimate. What do
> >> you think of that?
> >> > I find one single metric much more usable. It would be perfect if metric
> >> > value is represented in percentage, e.g. current progress of local node
> >> > index rebuild is 60%.
> >>
> >> 10.08.2020, 19:11, "Ivan Rakov" <ivan.glu...@gmail.com>:
> >> > Folks,
> >> >
> >> > Sorry for coming late to the party. I've taken a look at this issue during
> >> > review.
> >> >
> >> > How can a local number of processed keys can help us to understand when
> >> > index rebuild will be finished?
> >> > We can't compare metric value with cache.size(). First one is node-local,
> >> > while cache size covers all partitions in the cluster.
> >> > Also, I don't understand why we need to keep separate metrics for all
> >> > caches. Of course, the metric becomes more fair, but obviously harder to
> >> > make conclusions on whether "the index rebuild" process is over (and the
> >> > cluster is ready to process queries quickly).
> >> >
> >> > I find one single metric much more usable. It would be perfect if metric
> >> > value is represented in percentage, e.g. current progress of local node
> >> > index rebuild is 60%.
> >> >
> >> > --
> >> > Best regards,
> >> > Ivan
> >> >
> >> > On Fri, Jul 24, 2020 at 1:35 PM Stanislav Lukyanov <stanlukya...@gmail.com>
> >> > wrote:
> >> >
> >> >> Got it.
> >> >> I thought that index building and index rebuilding are essentially
> >> >> the same,
> >> >> but now I see that they are different: index rebuilding cares about all
> >> >> indexes at once while index building cares about particular ones.
> >> >>
> >> >> Kirill's approach sounds good.
> >> >>
> >> >> Stan
> >> >>
> >> >> > On 20 Jul 2020, at 14:54, Alexey Goncharuk <alexey.goncha...@gmail.com>
> >> >> > wrote:
> >> >> >
> >> >> > Stan,
> >> >> >
> >> >> > Currently we never build indexes one-by-one - we always use a cache data
> >> >> > row visitor which either updates all indexes (see IndexRebuildFullClosure)
> >> >> > or updates a set of all indexes that need to catch up (see
> >> >> > IndexRebuildPartialClosure). Given that, I do not see any need for
> >> >> > per-index rebuild status as this status will be updated for all outdated
> >> >> > indexes simultaneously.
> >> >> >
> >> >> > Kirill's approach for the total number of processed keys per cache seems
> >> >> > reasonable to me.
> >> >> >
> >> >> > --AG
> >> >> >
> >> >> > Fri, 3 Jul 2020 at 10:12, ткаленко кирилл <tkalkir...@yandex.ru>:
> >> >> >
> >> >> >> Hi, Stan!
> >> >> >>
> >> >> >> Perhaps it is worth clarifying what exactly I wanted to say.
> >> >> >> Now we have 2 processes: building and rebuilding indexes.
> >> >> >>
> >> >> >> At moment, we have some metrics for rebuilding indexes:
> >> >> >> "IsIndexRebuildInProgress", "IndexBuildCountPartitionsLeft".
> >> >> >>
> >> >> >> I suggest adding another metric "Indexrebuildkeyprocessed", which will
> >> >> >> allow you to determine how many records are left to rebuild for cache.
> >> >> >>
> >> >> >> I think your comments are more about building an index that may need more
> >> >> >> metrics, but I think you should do it in a separate ticket.
> >> >> >>
> >> >> >> 03.07.2020, 03:09, "Stanislav Lukyanov" <stanlukya...@gmail.com>:
> >> >> >>> If multiple indexes are to be built "number of indexed keys" metric may
> >> >> >>> be misleading.
> >> >> >>>
> >> >> >>> As a cluster admin, I'd like to know:
> >> >> >>> - Are all indexes ready on a node?
> >> >> >>> - How many indexes are to be built?
> >> >> >>> - How much resources are used by the index building (how many threads
> >> >> >>> are used)?
> >> >> >>> - Which index(es?) is being built right now?
> >> >> >>> - How much time until the current (single) index building finishes? Here
> >> >> >>> "time" can be a lot of things: partitions, entries, percent of the cache,
> >> >> >>> minutes and hours
> >> >> >>> - How much time until all indexes are built?
> >> >> >>> - How much does it take to build each of my indexes / a single index of
> >> >> >>> my cache on average?
> >> >> >>>
> >> >> >>> I think we need a set of metrics and/or log messages to solve all of
> >> >> >>> these questions.
> >> >> >>> I imagine something like:
> >> >> >>> - numberOfIndexesToBuild
> >> >> >>> - a standard set of metrics on the index building thread pool (do we
> >> >> >>> already have it?)
> >> >> >>> - currentlyBuiltIndexName (assuming we only build one at a time which is
> >> >> >>> probably not true)
> >> >> >>> - for the "time" metrics I think percentage might be the best as it's
> >> >> >>> the easiest to understand; we may add multiple metrics though.
> >> >> >>> - For "time per each index" I'd add detailed log messages stating how
> >> >> >>> long did it take to build a particular index
> >> >> >>>
> >> >> >>> Thanks,
> >> >> >>> Stan
> >> >> >>>
> >> >> >>>> On 26 Jun 2020, at 12:49, ткаленко кирилл <tkalkir...@yandex.ru>
> >> >> >>>> wrote:
> >> >> >>>>
> >> >> >>>> Hi, Igniters.
> >> >> >>>>
> >> >> >>>> I would like to know if it is possible to estimate how much the index
> >> >> >>>> rebuild will take?
> >> >> >>>>
> >> >> >>>> At the moment, I have found the following metrics [1] and [2] and
> >> >> >>>> since the rebuild is based on caches, I think it would be useful to know
> >> >> >>>> how many records are processed in indexing. This way we can estimate how
> >> >> >>>> long we have to wait for the index to be rebuilt by subtracting [3] and how
> >> >> >>>> many records are indexed.
> >> >> >>>>
> >> >> >>>> I think we should add this metric [4].
> >> >> >>>>
> >> >> >>>> Comments, suggestions?
> >> >> >>>>
> >> >> >>>> [1] - https://issues.apache.org/jira/browse/IGNITE-12184
> >> >> >>>> [2] - org.apache.ignite.internal.processors.cache.CacheGroupMetricsImpl#idxBuildCntPartitionsLeft
> >> >> >>>> [3] - org.apache.ignite.cache.CacheMetrics#getCacheSize
> >> >> >>>> [4] - org.apache.ignite.cache.CacheMetrics#getNumberIndexedKeys
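
For illustration only, here is a minimal sketch of the two calculations discussed in this thread: the progress fraction [processedKeys] / [localCacheSize], and the rate-based estimate from the 50_000 keys per minute example. The class IndexRebuildEstimate and its method names are hypothetical and are not part of Apache Ignite; the inputs are assumed to be a node-local count of processed keys and the node-local cache size. Treating an empty local cache as fully rebuilt is also an assumption.

```java
import java.time.Duration;

/** Hypothetical helper for illustration; not an Apache Ignite class. */
final class IndexRebuildEstimate {
    private IndexRebuildEstimate() {
        // Static helpers only.
    }

    /**
     * Progress as a fraction in [0, 1], computed as processedKeys / localCacheSize.
     * Both arguments are assumed to be node-local values.
     */
    static float progress(long processedKeys, long localCacheSize) {
        // Assumption: an empty local cache has nothing to rebuild, so report 1.0.
        if (localCacheSize <= 0)
            return 1.0f;

        double ratio = (double) processedKeys / localCacheSize;

        // Clamp, since the cache size may change while the rebuild is running.
        return (float) Math.max(0.0, Math.min(1.0, ratio));
    }

    /**
     * Remaining time given an observed rebuild rate in keys per minute,
     * rounded up to whole minutes.
     */
    static Duration remaining(long processedKeys, long localCacheSize, double keysPerMinute) {
        if (keysPerMinute <= 0)
            throw new IllegalArgumentException("keysPerMinute must be positive");

        long keysLeft = Math.max(0L, localCacheSize - processedKeys);

        return Duration.ofMinutes((long) Math.ceil(keysLeft / keysPerMinute));
    }
}
```

With the numbers from the thread (1_000_000_000 keys at 50_000 keys per minute, nothing processed yet), remaining(...) gives 20_000 minutes. A monitoring system that only sees the progress fraction would have to derive the rate itself from two samples taken some time apart.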