The following bug has been logged on the website:

Bug reference: 6342
Logged by: Andrea Grassi
Email address: andreagra...@sogeasoft.com
PostgreSQL version: 8.4.8
Operating system: SUSE SLES 10 SP4 64 BIT
Description:
Hi,

I have a serious and strange problem. Sometimes libpq stays blocked in the poll() function even though the server has already answered the query. If I attach to the process with kdbg, I find this stack:

__kernel_vsyscall()
poll() from /lib/libc.so.6
pqSocketCheck() from /home/pg/pgsql/lib-32/libpq.so.5
pqWaitTimed() from /home/pg/pgsql/lib-32/libpq.so.5
pqWait() from /home/pg/pgsql/lib-32/libpq.so.5
PQgetResult() from /home/pg/pgsql/lib-32/libpq.so.5
PQexecFinish() from /home/pg/pgsql/lib-32/libpq.so.5
…

To simplify the context and reproduce the bug, I wrote a test program (attached below) that uses only the libpq interface (no other libraries) to read my database on localhost. It loops over a table of 64000 rows and, for each row, reads another table. A run generally takes 1 minute. I run this program in a loop, so that it restarts as soon as it finishes. Usually it works fine, but sometimes (with no apparent pattern) it blocks. It always blocks with the stack above, while executing the PQexec function ("CLOSE CURSOR xx" or "FETCH ALL IN xx"). If I press "continue" in kdbg after attaching to the process, the program continues its execution and exits successfully.

Here are the specifics of the platform (SLES 10 SP4 64-bit, WITHOUT any VMware):

Server: HP DL 580 G7
CPU: 4 CPU INTEL XEON X7550
RAM: 64 GB
Disks: 8 HD 600GB SAS DP 6G 2.5", RAID 1 and RAID 5
OS: SUSE SLES 10 SP4 64 BIT
Kernel: Linux linuxspanesi 2.6.16.60-0.85.1-smp #1 SMP Thu Mar 17 11:45:06 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
Postgres server: 8.4.8, 64-bit
Libpq: 8.4.8, 32-bit

I tried recompiling libpq:
- in debug mode
- on a 64-bit machine with the -m32 option
- on a 32-bit machine
- with HAVE_POLL set to false at line 1053 in fe-misc.c, to force the other branch of the "#ifdef/else" so that select() is used instead of poll()

but none of these fixes the bug. I got the same stack as above, except in the last case, where I had "___newselect_nocancel()" instead of "poll()".
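For reference, the kind of wait at the bottom of that stack can be tried outside libpq. The following is only a minimal sketch under stated assumptions, not libpq's code: it assumes a POSIX system, uses socketpair() as a stand-in for the client/server connection, and the names wait_readable and poll_after_data_arrived are hypothetical. It queues a reply on the socket before poll() is called, polls, then drains the data and polls again. poll() reports the current readiness of the descriptor, so data that arrived before the call should still be reported.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a libpq function): returns 1 if fd is readable
 * within timeout_ms, 0 on timeout, -1 on error.
 */
int
wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int         rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    return (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/*
 * Queues a reply before polling, then drains it and polls again.
 * Stores the two poll results in *before_drain and *after_drain.
 * Returns 0 on success, -1 on setup or I/O failure.
 */
int
poll_after_data_arrived(int *before_drain, int *after_drain)
{
    int         sv[2];
    const char  reply[] = "reply queued before poll";
    char        buf[64];
    int         ok = -1;

    /* sv[0] plays the client end, sv[1] the server end (an assumption) */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* The "server" end answers before the "client" starts waiting. */
    if (write(sv[1], reply, sizeof(reply)) == (ssize_t) sizeof(reply))
    {
        /* Data is already queued: poll with a 1 second timeout. */
        *before_drain = wait_readable(sv[0], 1000);

        if (read(sv[0], buf, sizeof(buf)) == (ssize_t) sizeof(reply))
        {
            /* Nothing is left to read: poll without waiting. */
            *after_drain = wait_readable(sv[0], 0);
            ok = 0;
        }
    }

    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

To try it, call poll_after_data_arrived() from a small main() and print the two results. The sketch uses a local socket pair rather than a TCP connection to a server, so it only illustrates the documented behaviour of poll() and does not reproduce the hang described here.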
If I check the state of the connection using the "netstat" command, I get this output:

tcp 24 0 127.0.0.1:49007 127.0.0.1:5432 ESTABLISHED 17415/pq_example.e

where the second field (Recv-Q) stays stuck at a non-zero value. It seems that the server has already answered, but libpq or the poll function does not notice it. Consider that the machine is very good and very fast. It seems that the answer from the server arrives before libpq starts waiting for it (by calling poll). Could that be?

I also installed a VMware virtual machine with the same version of Linux and the same kernel version on a much less powerful machine: there my program works fine and never blocks.

Here is the code of the example program:

/*
 * testlibpq.c
 *
 * Test the C version of libpq, the PostgreSQL frontend library.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "libpq-fe.h"

/* Declared up front because main() calls it before its definition */
static int read_rigpia(PGconn *conn, const char *cdart);

static void
exit_nicely(PGconn *conn)
{
    PQfinish(conn);
    exit(1);
}

int
main(int argc, char **argv)
{
    PGconn     *conn;
    PGresult   *res;
    int         i;

    /*
     * Make a connection to the database.  When built with -DCASE1 the host
     * is taken from the SQLSERVER environment variable; otherwise a fixed
     * conninfo string is used.
     */
#ifdef CASE1
    conn = PQsetdbLogin(getenv("SQLSERVER"),    /* pghost */
                        0,      /* pgport */
                        0,      /* pgoptions */
                        0,      /* pgtty */
                        "OSA",  /* dbName */
                        0,      /* login */
                        0);     /* pwd */
#else
    conn = PQconnectdb("dbname = OSA");
#endif

    /* Check to see that the backend connection was successfully made */
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "Connection to database failed: %s",
                PQerrorMessage(conn));
        exit_nicely(conn);
    }

    res = PQexec(conn, "SET datestyle='ISO'");
    switch (PQresultStatus(res))
    {
        case PGRES_BAD_RESPONSE:
        case PGRES_NONFATAL_ERROR:
        case PGRES_FATAL_ERROR:
            fprintf(stderr, "SET DATESTYLE command failed: %s",
                    PQresultErrorMessage(res));
            break;
        default:
            break;
    }
    PQclear(res);

    /*
     * Our test case here involves using a cursor, for which we must be
     * inside a transaction block.
     */

    /* Start a transaction block */
    res = PQexec(conn, "BEGIN");
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        fprintf(stderr, "BEGIN command failed: %s", PQerrorMessage(conn));
        PQclear(res);
        exit_nicely(conn);
    }

    /*
     * Should PQclear PGresult whenever it is no longer needed to avoid
     * memory leaks
     */
    PQclear(res);

    /* Fetch the article codes from base_a_artico */
    res = PQexec(conn, "DECLARE articoli CURSOR FOR select cdart from base_a_artico ORDER BY cdart");
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        fprintf(stderr, "DECLARE CURSOR failed: %s", PQerrorMessage(conn));
        PQclear(res);
        exit_nicely(conn);
    }
    PQclear(res);

    res = PQexec(conn, "FETCH ALL in articoli");
    if (PQresultStatus(res) != PGRES_TUPLES_OK)
    {
        fprintf(stderr, "FETCH ALL failed: %s", PQerrorMessage(conn));
        PQclear(res);
        exit_nicely(conn);
    }

    /* next, read the second table once for each row */
    for (i = 0; i < PQntuples(res); i++)
    {
        read_rigpia(conn, PQgetvalue(res, i, 0));
    }
    PQclear(res);

    /* close the portal ... we don't bother to check for errors ... */
    res = PQexec(conn, "CLOSE articoli");
    PQclear(res);

    /* end the transaction */
    res = PQexec(conn, "END");
    PQclear(res);

    /* close the connection to the database and cleanup */
    PQfinish(conn);

    return 0;
}

/* Reads the adp_d_rigpia rows for one article; returns 1 on success, 0 on error */
static int
read_rigpia(PGconn *conn, const char *cdart)
{
    PGresult   *res;
    char        sql[1024];
    int         i;
    char       *dtfab;
    char       *sum;

    /* snprintf keeps the statement within the buffer and terminates it */
    snprintf(sql, sizeof(sql),
             "DECLARE rigpia CURSOR FOR select dtfab,sum(qtfan-qtpro) from adp_d_rigpia where flsta='' and cdart='%s' and qtfan>qtpro and cddpu not in ('04','05','06','07','08','09', '91','92','93','94','95','96','97','98','A0','B8','C2','LF','SC') group by dtfab",
             cdart);

    res = PQexec(conn, sql);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        fprintf(stderr, "DECLARE CURSOR rigpia failed: %s *** %s",
                PQerrorMessage(conn), sql);
        PQclear(res);
        return 0;
    }
    PQclear(res);

    res = PQexec(conn, "FETCH ALL in rigpia");
    if (PQresultStatus(res) != PGRES_TUPLES_OK)
    {
        fprintf(stderr, "FETCH ALL failed in rigpia: %s",
                PQerrorMessage(conn));
        PQclear(res);
        return 0;
    }

    /* next, walk the rows; the values are read but not printed */
    for (i = 0; i < PQntuples(res); i++)
    {
        dtfab = PQgetvalue(res, i, 0);
        sum = PQgetvalue(res, i, 1);
    }
    PQclear(res);

    res = PQexec(conn, "CLOSE rigpia");
    PQclear(res);

    return 1;
}

Regards,
Andrea