Hi,

It's the same logic with any polling system. A Monte Carlo integration 
computed from only a few points won't be accurate enough and can even be 
completely wrong.
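To make the Monte Carlo analogy concrete, here is a minimal sketch in Python. 
The integrand (sin over [0, pi], exact value 2), the point counts and the 
helper name mc_integrate are my own choices for illustration only.

```python
import math
import random

def mc_integrate(f, a, b, n, rng):
    """Estimate the integral of f over [a, b] from n uniform random points."""
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

rng = random.Random(42)  # fixed seed so the run is repeatable
exact = 2.0              # integral of sin(x) over [0, pi]
for n in (2, 10, 1000, 100000):
    estimate = mc_integrate(math.sin, 0.0, math.pi, n, rng)
    print(f"n={n:>6}  estimate={estimate:.4f}  error={abs(estimate - exact):.4f}")
```

With 2 points the estimate can land almost anywhere between 0 and pi; the 
error only shrinks roughly as 1/sqrt(n) as points are added.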
Polling is OK to troubleshoot a problem on the fly, but 2 points are not enough. 
A few seconds are needed to obtain good enough data, e.g. 5-10 seconds of 
polling with an interval of 0.1 s down to 0.01 s between 2 queries of the 
activity.
Polling for a few seconds while the user is waiting is normally enough to tell 
whether a significant part of the wait is spent on the database. It's very 
important to know that. With 1 hour of accumulated statistics, a DBA will 
always see something to fix. But if the user waits 10 seconds on a particular 
screen and only 1 second is spent on the database, fixing it often won't 
directly help.
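The 10 seconds / 1 second case can be sketched as a small simulation. This is 
not my tool and it does not query PostgreSQL: the workload (one 10 ms database 
burst in every 0.1 s slot, so 1 s on the database out of 10 s) and the names 
make_bursts, on_database and poll are hypothetical, chosen only to show how 
the polling interval changes the estimate.

```python
import random

DURATION = 10.0  # seconds the user waits (assumed)
SLOT = 0.1       # one database burst per slot (assumed workload)
BURST = 0.01     # each burst lasts 10 ms, so 1 s on the database in total

def make_bursts(rng):
    """Return the start time of one burst placed at random inside each slot."""
    n_slots = int(round(DURATION / SLOT))
    return [i * SLOT + rng.uniform(0.0, SLOT - BURST) for i in range(n_slots)]

def on_database(starts, t):
    """True if the simulated session is inside a database burst at time t."""
    i = int(t / SLOT)
    if i < 0 or i >= len(starts):
        return False
    return starts[i] <= t < starts[i] + BURST

def poll(starts, interval, rng):
    """Sample the session every `interval` seconds; return the observed
    fraction of samples that found it on the database."""
    t = rng.uniform(0.0, interval)  # random phase of the first sample
    hits = samples = 0
    while t < DURATION:
        hits += on_database(starts, t)
        samples += 1
        t += interval
    return hits / samples

starts = make_bursts(random.Random(7))
for interval in (5.0, 0.1, 0.01):  # 5.0 s gives only 2 samples
    estimate = poll(starts, interval, random.Random(3))
    print(f"interval={interval:>5}s  estimated database share={estimate:.3f}")
```

With 2 samples the only possible answers are 0, 0.5 or 1, none of which is 
close to the true share of 0.1. At a 0.01 s interval the estimate settles near 
0.1, which is enough to say that most of the wait is not on the database.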
Polling gives great information with PostgreSQL 10, but it was already useful 
to catch top queries etc. in older versions.
I always check with known cases that activity is adequately reported by my 
tool. I want to be sure it will report things adequately in real-world 
troubleshooting sessions. Sometimes there are bugs in my tool; once there was 
an issue with Postgres (pgstat_report_activity() was not called by workers in 
parallel index creation).

Best regards
Phil

From: Michael Paquier <mich...@paquier.xyz>
Sent: Thursday, October 4, 2018 12:58
To: Phil Florent
Cc: Yotsunaga, Naoki; Tomas Vondra; pgsql-hackers@lists.postgresql.org
Subject: Re: [Proposal] Add accumulated statistics for wait event

On Thu, Oct 04, 2018 at 09:32:37AM +0000, Phil Florent wrote:
> I am a DB beginner, so please tell me. It says that you can find
> events that are bottlenecks in sampling, but as you saw above, you can
> not find events shorter than the sampling interval, right?

Yes, which is why it would be as simple as making the interval shorter,
though not so short that it bloats the amount of information fetched,
which needs to be stored and afterwards (perhaps) processed for analysis.
This gets rather close to signal processing.  A simple illustration:
assuming that event A happens 100 times in an interval of 1s,
and event B only once in the same interval of 1s, then if the snapshot
interval is only 1s, in the worst case A would be treated as an equal
of B, which would be wrong.
--
Michael
