+1

On Mon, Jan 9, 2023 at 4:50 AM Enrico Olivelli <eolive...@gmail.com> wrote:
> I agree with the change.
> I have never used those metrics
>
> Enrico
>
> On Mon, Dec 26, 2022 at 10:29, Wenbing Shen
> <oliver.shen...@gmail.com> wrote:
> >
> > Hi BookKeepers, I've changed the limitStatsLogging default value from
> > false to true:
> > BP-60 <https://github.com/apache/bookkeeper/issues/3718>
> >
> > Motivation
> >
> > We have an efficient online bookie cluster with hundreds of bookie
> > nodes deployed on SSD disks.
> >
> > We separate the AutoRecovery cluster and the Bookie cluster for
> > independent deployment.
> >
> > I observed that GC on our AutoRecovery cluster is very frequent. After
> > investigation, I found that limitStatsLogging is disabled by default
> > for the bookkeeper client PCBC, so a large number of per-channel
> > monitoring metrics are generated. Because the bookie cluster has so
> > many nodes, this metric data occupies a large amount of heap memory.
> >
> > A single StringWriter object occupies 16MB of memory, and nearly 70
> > such objects are waiting to be collected by the next GC, occupying
> > 1GB+ of heap memory.
> >
> > Proposal
> >
> > In my own use I have not found these PCBC monitoring metrics useful;
> > at least so far, I have not used them effectively.
> >
> > If AutoRecovery and the Bookie run mixed in one process, these large
> > objects will affect the performance and stability of the Bookie
> > cluster.
> >
> > Since I can't see the value of having these metrics on by default, I
> > suggest changing the default value of limitStatsLogging to true.
> >
> > Everyone can still choose to turn it on or off, but it is difficult
> > for users to find out what effect this parameter has. By the time
> > their cluster has grown to hundreds or thousands of nodes and they
> > notice the problem, they may have to restart hundreds to thousands of
> > bookies in a rolling manner.
> >
> > At the same time, I observed that in Pulsar the various bookkeeper
> > client metrics are turned off by default because they really affect
> > the performance of the Pulsar service. That is enough to show that we
> > should try to change this, especially for the very redundant metrics
> > created per channel.
> >
> > Compatibility, Deprecation, and Migration Plan
> >
> > Clients that rely on PCBC metrics monitoring need to pay attention to
> > this upgrade, but it will not affect the actual functions of the
> > client, only the metrics data, and users can choose to enable it
> > again.
> >
> > What do you think about it?
> >
> > Best.
> > Wenbing

--
Andrey Yegorov