Re: [PERFORM] Postgres Replaying WAL slowly

2014-09-17 Thread Simon Riggs
On 1 July 2014 20:20, Tom Lane wrote: > Jeff Frost writes: >>> On Jun 30, 2014, at 4:04 PM, Tom Lane wrote: Did you check whether the locks were all on temp tables of the ON COMMIT DROP persuasion? > >> And indeed it did catch up overnight and the lag increased shortly after a >> corr

Re: [PERFORM] Postgres Replaying WAL slowly

2014-07-03 Thread Simon Riggs
On 1 July 2014 20:20, Tom Lane wrote: > I don't plan to work on this myself, but perhaps someone with more > motivation will want to run with these ideas. I was planning to work on improving performance of replication apply over the summer, mid July - Aug, so I'll add this to the list. -- Sim

Re: [PERFORM] Postgres Replaying WAL slowly

2014-07-02 Thread Tom Lane
Andres Freund writes: > On 2014-07-01 15:20:37 -0400, Tom Lane wrote: >> It seems like there are three, not mutually exclusive, ways we might >> address this: > But I think more importantly it's probably quite possible to hit a > similar problem without ON COMMIT DROP relations. Say DISCARD TEMP

Re: [PERFORM] Postgres Replaying WAL slowly

2014-07-02 Thread Andres Freund
On 2014-07-01 15:20:37 -0400, Tom Lane wrote: > Jeff Frost writes: > >> On Jun 30, 2014, at 4:04 PM, Tom Lane wrote: > >>> Did you check whether the locks were all on temp tables of the > >>> ON COMMIT DROP persuasion? > > > And indeed it did catch up overnight and the lag increased shortly afte

Re: [PERFORM] Postgres Replaying WAL slowly

2014-07-01 Thread Tom Lane
Jeff Frost writes: >> On Jun 30, 2014, at 4:04 PM, Tom Lane wrote: >>> Did you check whether the locks were all on temp tables of the >>> ON COMMIT DROP persuasion? > And indeed it did catch up overnight and the lag increased shortly after a > correlating spike in AccessExclusiveLocks that were

Re: [PERFORM] Postgres Replaying WAL slowly

2014-07-01 Thread Jeff Frost
On Jun 30, 2014, at 4:57 PM, Jeff Frost wrote: > > On Jun 30, 2014, at 4:04 PM, Tom Lane wrote: > >> Ah ... that's more like a number I can believe something would have >> trouble coping with. Did you see a noticeable slowdown with this? >> Now that we've seen that number, of course it's poss

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Jeff Frost
On Jun 30, 2014, at 4:04 PM, Tom Lane wrote: > Ah ... that's more like a number I can believe something would have > trouble coping with. Did you see a noticeable slowdown with this? > Now that we've seen that number, of course it's possible there was an > even higher peak occurring when you sa

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Andres Freund
On 2014-06-30 19:04:20 -0400, Tom Lane wrote: > Jeff Frost writes: > >>> So it seems like we have a candidate explanation. I'm a bit surprised > >>> that StandbyReleaseLocks would get this slow if there are only a dozen > >>> AccessExclusiveLocks in place at any one time, though. Perhaps that >

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Tom Lane
Jeff Frost writes: >>> So it seems like we have a candidate explanation. I'm a bit surprised >>> that StandbyReleaseLocks would get this slow if there are only a dozen >>> AccessExclusiveLocks in place at any one time, though. Perhaps that >>> was a low point and there are often many more? > Si

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Jeff Frost
On Jun 30, 2014, at 1:46 PM, Jeff Frost wrote: >> So it seems like we have a candidate explanation. I'm a bit surprised >> that StandbyReleaseLocks would get this slow if there are only a dozen >> AccessExclusiveLocks in place at any one time, though. Perhaps that >> was a low point and there

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Jeff Frost
On Jun 30, 2014, at 1:39 PM, Tom Lane wrote: > > >> Another item of note is the system catalogs are quite bloated: >> Would that cause the replica to spin on StandbyReleaseLocks? > > AFAIK, no. It's an unsurprising consequence of heavy use of short-lived > temp tables though. > Yah, this h

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Tom Lane
Jeff Frost writes: > On Jun 30, 2014, at 1:15 PM, Andres Freund wrote: >> So these are probably relations created in uncommitted >> transactions. Possibly ON COMMIT DROP temp tables? > That would make sense. There are definitely quite a few of those being used. Uh-huh. I doubt that the mechan

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Jeff Frost
On Jun 30, 2014, at 1:15 PM, Andres Freund wrote: > On 2014-06-30 12:57:56 -0700, Jeff Frost wrote: >> >> On Jun 30, 2014, at 12:54 PM, Matheus de Oliveira >> wrote: >> >>> >>> On Mon, Jun 30, 2014 at 4:42 PM, Jeff Frost wrote: >>> And if you go fishing in pg_class for any of the oids, you

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Andres Freund
On 2014-06-30 12:57:56 -0700, Jeff Frost wrote: > > On Jun 30, 2014, at 12:54 PM, Matheus de Oliveira > wrote: > > > > > On Mon, Jun 30, 2014 at 4:42 PM, Jeff Frost wrote: > > And if you go fishing in pg_class for any of the oids, you don't find > > anything: > > > > That is probably becaus

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Jeff Frost
On Jun 30, 2014, at 12:54 PM, Matheus de Oliveira wrote: > > On Mon, Jun 30, 2014 at 4:42 PM, Jeff Frost wrote: > And if you go fishing in pg_class for any of the oids, you don't find > anything: > > That is probably because you are connected in the wrong database. Once you > connect to th

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Matheus de Oliveira
On Mon, Jun 30, 2014 at 4:42 PM, Jeff Frost wrote: > And if you go fishing in pg_class for any of the oids, you don't find > anything: That is probably because you are connected in the wrong database. Once you connect to the database of interest, you don't even need to query pg_class, just cast

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Jeff Frost
On Jun 30, 2014, at 12:32 PM, Tom Lane wrote: > Jeff Frost writes: >> Sampling pg_locks on the primary shows ~50 locks with ExclusiveLock mode: > >> mode | count >> --+--- >> AccessExclusiveLock |11 >> AccessShareLock | 2089 >>

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Tom Lane
Jeff Frost writes: > Sampling pg_locks on the primary shows ~50 locks with ExclusiveLock mode: >mode | count > --+--- > AccessExclusiveLock |11 > AccessShareLock | 2089 > ExclusiveLock|46 > RowExclusiveLock

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Jeff Frost
On Jun 30, 2014, at 12:17 PM, Jeff Frost wrote: >> >> already is quite helpful. >> >> What are you doing on that system? Is there anything requiring large >> amounts of access exclusive locks on the primary? Possibly large amounts >> of temporary relations? > > > The last time we did a 100% l

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Jeff Frost
On Jun 30, 2014, at 11:39 AM, Andres Freund wrote: > On 2014-06-30 11:34:52 -0700, Jeff Frost wrote: >> On Jun 30, 2014, at 10:29 AM, Soni M wrote: > >>> It is >>> 96.62% postgres [.] StandbyReleaseLocks >>> as Jeff said. It runs quite long time, more than 5 minutes i think >>> >

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Pavel Stehule
2014-06-30 20:34 GMT+02:00 Jeff Frost : > On Jun 30, 2014, at 10:29 AM, Soni M wrote: > > > > > On Tue, Jul 1, 2014 at 12:14 AM, Andres Freund > wrote: > >> >> My guess it's a spinlock, probably xlogctl->info_lck via >> RecoveryInProgress(). Unfortunately inline assembler doesn't always seem >>

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Andres Freund
On 2014-06-30 11:34:52 -0700, Jeff Frost wrote: > On Jun 30, 2014, at 10:29 AM, Soni M wrote: > > It is > > 96.62% postgres [.] StandbyReleaseLocks > > as Jeff said. It runs quite long time, more than 5 minutes i think > > > > i also use hot standby. we have 4 streaming replica,

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Jeff Frost
On Jun 30, 2014, at 10:29 AM, Soni M wrote: > > > > On Tue, Jul 1, 2014 at 12:14 AM, Andres Freund wrote: > > My guess it's a spinlock, probably xlogctl->info_lck via > RecoveryInProgress(). Unfortunately inline assembler doesn't always seem > to show up correctly in profiles... > > What wo

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Soni M
On Tue, Jul 1, 2014 at 12:14 AM, Andres Freund wrote: > > My guess it's a spinlock, probably xlogctl->info_lck via > RecoveryInProgress(). Unfortunately inline assembler doesn't always seem > to show up correctly in profiles... > > What worked for me was to build with -fno-omit-frame-pointer - th

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Andres Freund
On 2014-06-30 19:14:24 +0300, Heikki Linnakangas wrote: > On 06/30/2014 05:46 PM, Soni M wrote: > >Here's what 'perf top' said on streaming replica : > > > >Samples: 26K of event 'cpu-clock', Event count (approx.): 19781 > > 95.97% postgres [.] 0x002210f3 > >

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Jeff Frost
On Jun 30, 2014, at 9:14 AM, Heikki Linnakangas wrote: > On 06/30/2014 05:46 PM, Soni M wrote: >> Here's what 'perf top' said on streaming replica : >> >> Samples: 26K of event 'cpu-clock', Event count (approx.): 19781 >> 95.97% postgres [.] 0x002210f3 > >

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Heikki Linnakangas
On 06/30/2014 05:46 PM, Soni M wrote: Here's what 'perf top' said on streaming replica : Samples: 26K of event 'cpu-clock', Event count (approx.): 19781 95.97% postgres [.] 0x002210f3 Ok, so it's stuck doing something.. Can you get build with debug symbol

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Soni M
Here's what 'perf top' said on streaming replica : Samples: 26K of event 'cpu-clock', Event count (approx.): 19781 95.97% postgres [.] 0x002210f3 0.41% perf [.] 0x0005f225 0.39% libc-2.12.so

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-30 Thread Heikki Linnakangas
On 06/29/2014 03:43 PM, Soni M wrote: top and sar says 100% cpu usage of one core, no sign of I/O wait. Hmm, I wonder what it's doing then... If you have "perf" installed on the system, you can do "perf top" to get a quick overlook of where the CPU time is spent. - Heikki -- Sent via pgs

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-29 Thread Soni M
Here's some data from standby (this is when replaying take 44seconds per wal files): 2014-06-29 00:07:36.513 CDT,,,16682,,53af6f46.412a,44,,2014-06-28 20:43:34 CDT,,0,LOG,0,"restartpoint complete: wrote 63187 buffers (3.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=209.170

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-29 Thread Soni M
top and sar says 100% cpu usage of one core, no sign of I/O wait. The database is 1.5TB in size. RAM in master is 145GB, on slave it's differ, some has about 16GB another has 145GB also. nothing suspicious on standby's postgres log. on master's postgres log : WARNING,01000,"pgstat wait timeout",,

Re: [PERFORM] Postgres Replaying WAL slowly

2014-06-29 Thread Heikki Linnakangas
On 06/29/2014 11:14 AM, Soni M wrote: Everything works fine until on Thursday we have high load on master, and after that every streaming replica lag further behind the master. Even on night and weekend where all server load is low. But the slony slave is OK at all. What does 'top' on the stand