On Thu, 2013-01-17 at 14:53 -0800, Jeff Davis wrote:
> Test plan:
>
> 1. Take current patch (without "skip VM check for small tables"
> optimization mentioned above).
> 2. Create 500 tables each about 1MB.
> 3. VACUUM them all.
> 4. Start 500 connections (one for each table)
> 5. Time the running of a loop that executes a COUNT(*) on that
> connection's table 100 times.
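
For reference, steps 2 and 3 of that plan can be scripted roughly as
follows. This is only a sketch: about 30,000 single-int rows per table
works out to roughly 1MB of heap, and the mb_N names are what the
attached test program expects.

    DO $$
    BEGIN
        FOR n IN 0..499 LOOP
            EXECUTE format('CREATE TABLE mb_%s (i int)', n);
            EXECUTE format('INSERT INTO mb_%s SELECT generate_series(1, 30000)', n);
        END LOOP;
    END
    $$;

    VACUUM;  -- sets the all-visible bits in the visibility map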
Done, with a few extra variables. Again, thanks to Nathan Boley for lending
me the 64-core box. Test program attached.

I did both 1MB tables and 1-tuple tables, but I ended up throwing out the
1-tuple results. First of all, as I said, that's a pretty easy problem to
solve, so it's not really what I want to test. Second, I had to do so many
iterations that I don't think I was testing anything useful. I did see what
might have been a couple of differences, but I would need to explore them in
more detail and I don't think it's worth it, so I'm only reporting on the
1MB tables.

For each test, each of 500 connections runs 10 iterations of a COUNT(*) on
its own 1MB table (which is vacuumed and has the VM bit set). The query is
prepared once, and the table has only an int column. The variable is
shared_buffers, going from 32MB (near exhaustion for 500 connections) to
2048MB (everything fits). The second column is the time range in seconds; I
included the range this time because there was more variance in the runs,
but I still think they are good test results.

master:
    32MB: 16.4 - 18.9
    64MB: 16.9 - 17.3
   128MB: 17.5 - 17.9
   256MB: 14.7 - 15.8
   384MB:  8.1 -  9.3
   448MB:  4.3 -  9.2
   512MB:  1.7 -  2.2
   576MB:  0.6 -  0.6
  1024MB:  0.6 -  0.6
  2048MB:  0.6 -  0.6

patch:
    32MB: 16.8 - 17.6
    64MB: 17.1 - 17.5
   128MB: 17.2 - 18.0
   256MB: 14.8 - 16.2
   384MB:  8.0 - 10.1
   448MB:  4.6 -  7.2
   512MB:  2.0 -  2.6
   576MB:  0.6 -  0.6
  1024MB:  0.6 -  0.6
  2048MB:  0.6 -  0.6

Conclusion: I see about what I expected: a precipitous drop in runtime once
everything fits in shared_buffers (with 500 1MB tables, an inflection point
around 512MB makes a lot of sense). There does seem to be a measurable
difference right around that inflection point, but it's not much, and
considering that this is the worst case I could devise, I am not too
concerned about it.

However, it is interesting to see that there really is a lot of maintenance
work being done when we need to move pages in and out of shared buffers. I'm
not sure that it's related to the freelists, though. For the extra pins to
really be a problem, a much higher percentage of the buffers would need to
be pinned. Since the case we are worried about involves scans (if it
involved indexes, it would already be using more than one pin per scan), the
only way to get a high percentage of pinned buffers is to have very small
tables. But we don't really need to use the VM when scanning very small
tables (the overhead would be elsewhere), so I think we're OK.

So, I have attached a new version of the patch that doesn't look at the VM
for tables with fewer than 32 pages. That's the only change.

Regards,
    Jeff Davis
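
P.S. To double-check that the test tables are comfortably above the new
32-page cutoff and that VACUUM left their visibility-map bits set, pg_class
is enough (relallvisible is available as of 9.2); something like:

    SELECT relname, relpages, relallvisible
      FROM pg_class
     WHERE relname ~ '^mb_[0-9]+$'
     ORDER BY relpages;
    -- after the VACUUM, relallvisible should be (close to) relpages,
    -- and relpages should be well above 32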
/*
 * Timing harness: fork one child per table; each child opens its own
 * connection and runs COUNT(*) on its table niter times.
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <libpq-fe.h>

#define QSIZE 256

void
test(char *query, int procnum, int niter)
{
    PGconn     *conn;
    PGresult   *result;
    int         i;

    conn = PQconnectdb("host=/tmp dbname=postgres");
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed!\n");
        exit(1);
    }

    /* prepare the per-table COUNT(*) query once */
    result = PQprepare(conn, "q", query, 0, NULL);
    if (PQresultStatus(result) != PGRES_COMMAND_OK)
    {
        fprintf(stderr, "PREPARE failed: %s", PQerrorMessage(conn));
        PQclear(result);
        exit(1);
    }
    PQclear(result);

    /* execute the prepared query niter times */
    for (i = 0; i < niter; i++)
    {
        result = PQexecPrepared(conn, "q", 0, NULL, NULL, NULL, 0);
        if (PQresultStatus(result) != PGRES_TUPLES_OK)
        {
            fprintf(stderr, "EXECUTE PREPARED failed: %s\n",
                    PQerrorMessage(conn));
            PQclear(result);
            exit(1);
        }
        PQclear(result);
    }

    PQfinish(conn);
}

int
main(int argc, char *argv[])
{
    int         niter;
    int         nprocs;
    char        query[QSIZE];
    int         i;
    pid_t      *procs;
    struct timeval tv1, tv2;

    if (argc != 3)
    {
        fprintf(stderr, "usage: %s nprocs niter\n", argv[0]);
        exit(1);
    }

    nprocs = atoi(argv[1]);
    niter = atoi(argv[2]);

    procs = malloc(sizeof(pid_t) * nprocs);

    gettimeofday(&tv1, NULL);

    /* fork one child per table; child i queries table mb_i */
    for (i = 0; i < nprocs; i++)
    {
        pid_t pid = fork();

        if (pid < 0)
        {
            perror("fork");
            exit(1);
        }
        else if (pid == 0)
        {
            snprintf(query, QSIZE, "SELECT COUNT(*) FROM mb_%d;", i);
            test(query, i, niter);
            exit(0);
        }
        else
        {
            procs[i] = pid;
        }
    }

    /* wait for all children and check their exit status */
    for (i = 0; i < nprocs; i++)
    {
        int status;

        waitpid(procs[i], &status, 0);
        if (!WIFEXITED(status))
        {
            fprintf(stderr, "child did not exit!\n");
            exit(1);
        }
        if (WEXITSTATUS(status) != 0)
        {
            fprintf(stderr, "child exited with status %d\n",
                    WEXITSTATUS(status));
            exit(1);
        }
    }

    gettimeofday(&tv2, NULL);

    free(procs);

    /* report total elapsed wall-clock time as seconds.microseconds */
    if (tv2.tv_usec < tv1.tv_usec)
    {
        tv2.tv_usec += 1000000;
        tv2.tv_sec--;
    }
    printf("%03d.%06d\n", (int) (tv2.tv_sec - tv1.tv_sec),
           (int) (tv2.tv_usec - tv1.tv_usec));

    return 0;
}
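
(To build and run: save the program as, say, vm_test.c, then something
along the lines of "cc vm_test.c -I`pg_config --includedir`
-L`pg_config --libdir` -lpq -o vm_test" and "./vm_test 500 10" for 500
children doing 10 iterations each; it prints the total wall-clock time
in seconds.)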
rm-pd-all-visible-20130118.patch.gz
Description: GNU Zip compressed data