Hi,

Is DBA really able to solve bottlenecks with sampling?

What I would like to say is that if we have information on the number of wait 
events and the wait time(like other DB), we can investigate more easily.

Yes you will be able to solve bottlenecks with sampling. In interactive mode, a 
1s interval is probably too large. I use 0s1 - 0s01 with my tool and it is 
normally OK. In batch mode I use 1s=>10s. If you want to visualize the results 
it's easy to use a dedicated tool and bottlenecks will clearly appear .
Since grafana is now able to connect directly to a postgresql source, I use it 
to display the information collected from pg_stat_activity and psutil ( e.g 
https://pgphil.ovh/traqueur_dashboard_02.php page is written in french but 
panels are in english)

Other DB have accumulated statistics but you can notice that sampling is also 
their most modern method.
E.g Oracle DB : 20 years ago you already had tools like "utlbstat/utlestat" . 
Then you had "statspack". Those tools were based on accumulated statistics and 
the reports were based on differences between 2 points. It was useful to solve 
major problems but it was limited and not precise enough in many cases.

The preferred feature to identify bottlenecks in the Oracle world is now ASH 
(active session history). It can help with major problems, specific problems 
AND it can identify short blockages.
Too bad it is licensed as an option of their Enterprise Edition but similar 
tools exist and they are also based on sampling of the activity.

With the "official" ASH, sampling and archiving are done internally and you 
have a circular memory zone dedicated to the feature. Hence the overhead is 
lower but that's all.

The most advanced interactive tool is called "snapper" and it is also based on 
sampling.

Best regards
Phil


________________________________
De : Yotsunaga, Naoki <yotsunaga.na...@jp.fujitsu.com>
Envoyé : lundi 29 octobre 2018 02:20
À : 'Phil Florent'; 'Michael Paquier'
Cc : 'Tomas Vondra'; 'pgsql-hackers@lists.postgresql.org'
Objet : RE: [Proposal] Add accumulated statistics for wait event


On Thu, Oct 4, 2018 at 8:22 PM, Yotsunaga Naoki wrote:



Hi, I understood and thought of your statistic comment once again. In the case 
of sampling, is there enough statistic to investigate?

In the case of a long SQL, I think that it is possible to obtain a sufficient 
sampling number.



However, in the case of about 1 minute of SQL, only 60 samples can be obtained 
at most.

#Because legard’s comment.

https://www.postgresql.org/message-id/1539158356795-0.post%40n3.nabble.com



Does this sampling number of 60 give information that I really want?

Perhaps it is not to miss the real problem part?

---------------------------------------

Naoki, Yotsunaga.

Reply via email to