Hello again, now I have restarted the slave with shared_buffers set to 32GB, and now the recovery process is at 100% CPU:
$ strace -c -f -p 27076 Process 27076 attached - interrupt to quit Process 27076 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 67.65 0.002127 0 18127 lseek 28.98 0.000911 0 16496 getppid 2.58 0.000081 0 758 read 0.80 0.000025 4 6 1 open 0.00 0.000000 0 4 close 0.00 0.000000 0 1 stat 0.00 0.000000 0 493 493 unlink ------ ----------- ----------- --------- --------- ---------------- 100.00 0.003144 35885 494 total $ strace -c -f -p 27076 Process 27076 attached - interrupt to quit Process 27076 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 44.44 0.000240 0 5976 lseek 38.15 0.000206 0 5522 getppid 17.41 0.000094 1 172 read 0.00 0.000000 0 4 1 open 0.00 0.000000 0 2 close 0.00 0.000000 0 1 stat 0.00 0.000000 0 159 159 unlink ------ ----------- ----------- --------- --------- ---------------- 100.00 0.000540 11836 160 total Then restarted again with shared_buffers set back to 2GB: now the replay process is relatively fast again and here is the strace: $ strace -c -f -p 36450 Process 36450 attached - interrupt to quit Process 39336 attached Process 39337 attached Process 39337 detached Process 39338 attached Process 39338 detached Process 39339 attached Process 39339 detached Process 39336 detached Process 39359 attached Process 39360 attached Process 39360 detached Process 39361 attached Process 39361 detached Process 39362 attached Process 39362 detached Process 39359 detached Process 40168 attached Process 40169 attached Process 40169 detached Process 40170 attached Process 40170 detached Process 40171 attached Process 40171 detached Process 40168 detached Process 36450 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 88.21 0.668159 31817 21 9 wait4 4.69 0.035553 6 5970 read 1.92 0.014572 0 120565 lseek 1.91 0.014474 5 2957 2954 unlink 1.55 0.011766 8 1545 write 1.27 0.009582 0 109274 getppid 0.31 0.002340 3 762 fcntl 0.03 0.000245 20 12 clone 0.02 0.000163 1 219 rt_sigprocmask 0.02 0.000162 1 154 10 open 0.01 0.000092 0 216 mmap 0.01 0.000084 1 133 42 stat 0.01 0.000066 6 12 getrlimit 0.01 0.000046 0 178 close 0.00 0.000037 1 27 munmap 0.00 0.000035 6 6 pipe 0.00 0.000035 12 3 set_tid_address 0.00 0.000030 1 42 getegid 0.00 0.000021 0 183 rt_sigaction 0.00 0.000000 0 108 fstat 0.00 0.000000 0 102 mprotect 0.00 0.000000 0 36 brk 0.00 0.000000 0 9 rt_sigreturn 0.00 0.000000 0 9 9 ioctl 0.00 0.000000 0 48 12 access 0.00 0.000000 0 6 dup2 0.00 0.000000 0 6 getpid 0.00 0.000000 0 12 execve 0.00 0.000000 0 9 uname 0.00 0.000000 0 42 getuid 0.00 0.000000 0 42 getgid 0.00 0.000000 0 45 geteuid 0.00 0.000000 0 6 getpgrp 0.00 0.000000 0 6 statfs 0.00 0.000000 0 12 arch_prctl 0.00 0.000000 0 6 3 futex 0.00 0.000000 0 3 set_robust_list ------ ----------- ----------- --------- --------- ---------------- 100.00 0.757462 242786 3039 total With best regards, -- Valentine Gogichashvili On Thu, Aug 16, 2012 at 4:53 PM, Valentine Gogichashvili <val...@gmail.com>wrote: > Hello Andreas, > > here is the process, that now actually is not using CPU at all and the > shared_buffers are set to 2GB: > > 50978 postgres 20 0 2288m 2.0g 2.0g S 0.0 1.6 4225:34 postgres: > startup process recovering 000000050000262E000000FD > > It is hanging on that file for several minutes now. > > and here is the strace: > > $ strace -c -f -p 50978 > Process 50978 attached - interrupt to quit > Process 50978 detached > % time seconds usecs/call calls errors syscall > ------ ----------- ----------- --------- --------- ---------------- > 94.82 0.007999 37 215 select > 2.73 0.000230 1 215 getppid > 2.45 0.000207 1 215 215 stat > ------ ----------- ----------- --------- --------- ---------------- > 100.00 0.008436 645 215 total > > What kind of additional profiling information would you like to see? > > Regards, > > -- Valentin > > > >