On Sun, Apr 23, 2017 at 12:41 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> That's after inlining the compare on both the binary and sequential
>> code, and it seems it lets the compiler optimize the binary search to
>> the point where it outperforms the sequential search.
>>
>> That's not the case when the compare isn't inlined.
>>
>> That seems in line with [1], which shows the impact of various
>> optimizations on both algorithms. It's clearly a close enough race
>> that optimizations play a huge role.
>>
>> Since we're not likely to go and implement SSE2-optimized versions, I
>> believe I'll leave the binary search only. That's the attached patch
>> set.
>
> That sounds reasonable based on your test results. I guess part of
> what I was wondering is whether a vacuum on a table large enough to
> require multiple gigabytes of work_mem isn't likely to be I/O-bound
> anyway. If so, a few cycles one way or the other isn't likely to
> matter much. If not, where exactly are all of those CPU cycles
> going?
I haven't been able to produce a table large enough to get a CPU-bound
vacuum, so such a case is likely to require huge storage and a very
powerful I/O system. Mine can only get about 100MB/s tops, and at that
speed, vacuum is I/O-bound even for multi-GB work_mem. That's why I've
been using the reported CPU time as the benchmark.

BTW, I left the benchmark script running all weekend at the office, and
when I got back a power outage had aborted it. In a few days I'll be out
on vacation, so I'm not sure I'll get the benchmark results anytime
soon. But since this patch moved to 11.0, I guess there's no rush.

Just FTR, in case I leave before the script is done, the script got to
scale 400 before the outage:

INFO: vacuuming "public.pgbench_accounts"
INFO: scanned index "pgbench_accounts_pkey" to remove 40000000 row versions
DETAIL: CPU: user: 5.94 s, system: 1.26 s, elapsed: 26.77 s.
INFO: "pgbench_accounts": removed 40000000 row versions in 655739 pages
DETAIL: CPU: user: 3.36 s, system: 2.57 s, elapsed: 61.67 s.
INFO: index "pgbench_accounts_pkey" now contains 0 row versions in 109679 pages
DETAIL: 40000000 index row versions were removed.
109289 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.06 s.
INFO: "pgbench_accounts": found 38925546 removable, 0 nonremovable row versions in 655738 out of 655738 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 1098
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 15.34 s, system: 6.95 s, elapsed: 126.21 s.
INFO: "pgbench_accounts": truncated 655738 to 0 pages
DETAIL: CPU: user: 0.22 s, system: 2.10 s, elapsed: 8.10 s.

In summary:

binsrch v10:
s100: CPU: user: 3.02 s, system: 1.51 s, elapsed: 16.43 s.
s400: CPU: user: 15.34 s, system: 6.95 s, elapsed: 126.21 s.

The old results:

Old patched (sequential search):
s100: CPU: user: 3.21 s, system: 1.54 s, elapsed: 18.95 s.
s400: CPU: user: 14.03 s, system: 6.35 s, elapsed: 107.71 s.
s4000: CPU: user: 228.17 s, system: 108.33 s, elapsed: 3017.30 s.

Unpatched:
s100: CPU: user: 3.39 s, system: 1.64 s, elapsed: 18.67 s.
s400: CPU: user: 15.39 s, system: 7.03 s, elapsed: 114.91 s.
s4000: CPU: user: 282.21 s, system: 105.95 s, elapsed: 3017.28 s.

I wouldn't fret over the slight slowdown vs. the old patch; it could be
noise (the script only completed a single run at scale 400).


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers