On Tue, Jan 7, 2025 at 10:44 AM Bertrand Drouvot
<bertranddrouvot...@gmail.com> wrote:
>
> Hi,
>
> On Wed, Dec 25, 2024 at 06:25:50PM +0100, Tomas Vondra wrote:
> > Hi,
> >
> > On 12/23/24 07:35, wenhui qiu wrote:
> > > Hi Tomas
> > >     This is a great feature.
> > > +    /*
> > > +     * Define (or redefine) custom GUC variables.
> > > +     */
> > > +    DefineCustomIntVariable("stats_history.size",
> > > +                            "Sets the amount of memory available for past events.",
> > > +                            NULL,
> > > +                            &statsHistorySizeMB,
> > > +                            1,
> > > +                            1,
> > > +                            128,
> > > +                            PGC_POSTMASTER,
> > > +                            GUC_UNIT_MB,
> > > +                            NULL,
> > > +                            NULL,
> > > +                            NULL);
> > > +
> > > RAM is in terabytes now, but statsHistorySize is capped at 128MB; I
> > > think it could be increased to store more history records?
> > >
> >
> > Maybe, the 128MB is an arbitrary (and conservative) limit - it's enough
> > for ~500k events, which seems good enough for most systems. Of course,
> > systems with many relations might need more space, not sure.
> >
> > I was thinking about specifying the space in more natural terms, either
> > as an amount of time ("keep 1 day of history") or a number of entries
> > ("10k entries"). That would probably mean the memory can't be allocated
> > as a fixed size.
> >
> > But maybe it'd be possible to just write the entries to a file. We don't
> > need random access to past entries (unlike e.g. pg_stat_statements), and
> > people won't query that very often either.
>
> Thanks for working on this!
>
> Another idea regarding the storage of those metrics: I think one would
> want to see "precise" data for recent metrics but would probably be fine
> with some level of aggregation for historical ones. Something like being
> able to retrieve "1 day of raw data" and, say, one year of data aggregated
> by day (average, maximum, minimum, standard deviation and maybe some
> percentiles) could be fine too.
>
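To put the downside of that concretely, here's a toy sketch (the numbers
are invented and it has nothing to do with the patch): once a day of
samples is rolled up into an average, a short spike that the raw series
would expose simply disappears.

    /*
     * Toy illustration (invented numbers, not from the patch): a short
     * spike in a metric vanishes once the day is reduced to an average.
     */
    #include <stdio.h>

    int
    main(void)
    {
        double  hourly[24] = {0};   /* 24 raw hourly samples, all zero */
        double  sum = 0.0;
        double  max = 0.0;
        int     i;

        hourly[12] = 5000.0;        /* one short spike at noon */

        for (i = 0; i < 24; i++)
        {
            sum += hourly[i];
            if (hourly[i] > max)
                max = hourly[i];
        }

        /*
         * The daily average (~208) looks unremarkable, while the raw
         * maximum (5000) is what actually reveals the incident.
         */
        printf("daily avg = %.1f, raw max = %.1f\n", sum / 24.0, max);
        return 0;
    }

Keeping max/min/percentiles alongside the average (as suggested above)
helps for questions you anticipated, but any fixed set of aggregates
locks in which questions you can answer later.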
While I'm sure some people are OK with it, I would say that most of the
observability/metrics community has moved away from aggregated data
storage towards raw time-series data in tools like Prometheus, TSDB, and
Timescale, in order to avoid the problems that misleading / lossy /
low-resolution data can create.

Robert Treat
https://xzilla.net