[Resending to Hackers, as the earlier mail was not delivered to the Hackers mailing list.]

On Mon, Jan 9, 2017 at 4:13 PM, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
>
> On Thu, Aug 25, 2016 at 2:46 PM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>
>> On Thu, Aug 25, 2016 at 6:57 AM, Robert Haas <robertmh...@gmail.com>
>> wrote:
>> >
>> > Personally, my preferred solution is still to have a background worker
>> > that samples the published wait events and rolls up statistics, but
>> > I'm not sure I've convinced anyone else. It could report the number
>> > of seconds since it detected a wait event other than the current one,
>> > which is not precisely the same thing as tracking the length of the
>> > current wait but it's pretty close. I don't know for sure what's best
>> > here - I think some experimentation and dialog is needed.
>>
>> Yes, using of background worker can reduce the load of adding all the
>> wait time calculations in the main backend. I can give a try by modifying
>> direct calculation approach and background worker (may be
>> pg_stat_collector)
>> to find the wait time based on the stat messages that are received from
>> main backend related to wait start and wait end.
>>
>> I am not sure with out getting any signal or message from main backend,
>> how much accurate the data can be gathered from a background worker.
>
> Apologies for coming back to an old thread.

As a proof of concept, I tried using the stats collector process as the
background worker that calculates the wait times for LWLocks, instead of
adding another background worker.

I created two hash tables: one stores the LWLock stats, and the other
stores per-backend information with the PID as the key.

Whenever a backend starts waiting for an LWLock, it sends a message to the
stats collector with its PID and the wait_event_info of the lock. When the
stats collector receives the message, it gets the start time and adds an
entry for that backend to the hash table.
Once the backend stops waiting for the lock, it sends a signal to the
stats collector, which looks up the backend's entry in the hash table,
computes the wait time, and adds that time to the corresponding LWLock
entry in the other hash table.

The LWLock wait stats are stored in the stats file for persistence.
There is currently no stats reset logic.

This patch should help in creating a view that displays the wait times of
all wait events, as discussed in [1].

Comments?

[1] - https://www.postgresql.org/message-id/CAASwCXdvQgZ-ox_SyYMF5TAJVH-_rW71vthZynS%3DEMeexN5Giw%40mail.gmail.com

Regards,
Hari Babu
Fujitsu Australia
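
P.S. For anyone who wants the bookkeeping described above in a more concrete
form, here is a rough sketch of the collector side only. It is a toy model in
Python, not code from the patch: the class and method names (WaitStatsCollector,
on_wait_start, on_wait_end) and the injectable clock are assumptions made for
illustration, and wait_event_info is treated as an opaque dictionary key.

```python
import time
from dataclasses import dataclass


@dataclass
class LWLockWaitStats:
    """Accumulated wait statistics for one LWLock wait event."""
    wait_count: int = 0
    total_wait_seconds: float = 0.0


class WaitStatsCollector:
    """Toy model of the collector-side bookkeeping (all names hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # Hash table 1: backends currently waiting, keyed by PID.
        self.waiting_backends = {}  # pid -> (wait_event_info, start_time)
        # Hash table 2: accumulated stats, keyed by wait_event_info.
        self.lwlock_stats = {}      # wait_event_info -> LWLockWaitStats

    def on_wait_start(self, pid, wait_event_info):
        # The start time is taken when the collector handles the message,
        # not when the backend actually began waiting.
        self.waiting_backends[pid] = (wait_event_info, self._clock())

    def on_wait_end(self, pid):
        entry = self.waiting_backends.pop(pid, None)
        if entry is None:
            return  # no matching wait-start was recorded; nothing to add
        wait_event_info, start_time = entry
        stats = self.lwlock_stats.setdefault(wait_event_info, LWLockWaitStats())
        stats.wait_count += 1
        stats.total_wait_seconds += self._clock() - start_time
```

Because both timestamps are taken in the collector, the measured interval
includes any delay in delivering the two notifications, which is the accuracy
concern raised earlier in the thread.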