Hi Tom, Unfortunately, I've tried your advice, and I think that we're still in a CPU-bound situation even after following the re-indexing and re-vacuuming.
Fortunately, though, I've just learned about a poor-man's profiler under Linux named pstack, and it's telling me that the vacuum process is spending most of its time in the enough_space() function in vacuum.c (somewhere around line 2574 in the most recent version in CVS I believe). When I run pstack, I almost always see the execution stack like this: % pstack 12158 #0 0x0000000000527478 in enough_space () #1 0x0000000000528b60 in repair_frag () #2 0x000000000052b130 in full_vacuum_rel () #3 0x000000000052bab8 in vacuum_rel () #4 0x000000000052bf12 in vacuum () #5 0x00000000005d69b9 in PortalRunUtility () #6 0x00000000005d79d5 in PortalRunMulti () #7 0x00000000005d80e5 in PortalRun () #8 0x00000000005d3e0b in exec_simple_query () #9 0x00000000005d4a6b in PostgresMain () #10 0x00000000005abe0b in ServerLoop () #11 0x00000000005ac9bd in PostmasterMain () #12 0x000000000055ae7e in main () Every now and then, I can see a call to ReadBufferExtended from repair_frag(), so the CPU-intensive part does eventually find what it's looking for, but I'm hopeful that this could somehow be done more efficiently and this process could be limited by I/O only. So...I guess my question is this: is it a known issue that this code path can get very cpu-bound for large numbers of blob garbage? Looking at the code for enough_space(), it is almost hard to believe that a lot of time is being spent here, but perhaps it's just getting called a LOT from repair_frag(). I'm no stranger to hacking code if I have to....thanks! Sam On Fri, Jan 22, 2010 at 4:06 PM, PG User 2010 <pguser2...@gmail.com> wrote: > Hi Tom, > > As always, your insight is VERY helpful. We'll try your suggestions and > see if that helps things out... Thanks! > > Sam > > > On Fri, Jan 22, 2010 at 4:01 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > >> PG User 2010 <pguser2...@gmail.com> writes: >> > 1) is there any easy way to fiddle with the vacuum process so that it is >> not >> > CPU bound and doing very little I/O? Why would vacuum full be CPU bound >> > anyway??? >> >> The only part of VAC FULL that seems like it could be CPU-bound is index >> cleanup. If the table is sufficiently bloated with dead tuples, that >> could take awhile. It might be useful to try this: >> >> 1. REINDEX TABLE pg_largeobject; >> 2. VACUUM pg_largeobject; >> 3. VACUUM FULL pg_largeobject; >> >> I have never tried this in a serious bloat situation, but in principle >> I think it should improve matters. The idea is to get rid of as many dead >> index and heap entries as you can before letting VAC FULL loose on it, and >> also do as much of the work as possible with a less-than-exclusive lock. >> Don't forget that large maintenance_work_mem will help the first two >> steps, as long as you don't set it so high as to drive the machine into >> swapping. >> >> > 2) is it possible to interrupt VACUUM FULL, then re-start it later on >> and >> > have it pick up where it was working before? >> >> NO. Doing that will in fact make things worse --- a failed VAC FULL >> results in even more dead entries to be cleaned up. >> >> regards, tom lane >> > >