Greetings:

I don't know whether anyone here has run into this problem yet, but it
has been affecting me for the past few weeks, ever since I upgraded my
MySQL server to 5.0.19. It took quite a bit of digging, but I believe I
have found the cause.

To describe the problem: when you run vpopmail in MySQL mode with
courier-authdaemond and MySQL 5.0 or later, everything works fine for
the first 8 hours. After that, nobody can authenticate to the mail
server, and "MySQL server has gone away" errors appear in the maillog.

The cause of the problem is that MySQL 5.0 (and probably some 4.1
releases) implements a new connection timeout that appears to ignore
traffic. When it expires, the server shuts down its end of the socket.
The clients (vchkpw and friends) do not know about this timeout or the
socket termination, so they continue on in ignorant bliss until they
try to send on the socket and find that it is no longer valid:
literally, "the server has gone away".

The fix is simply to destroy the internal flags and file handles
related to that socket, open a new connection, and try the query again.
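
As a rough illustration of that drop/rebuild/retry idea, here is a
minimal, self-contained C sketch. It does not use the MySQL client
library at all: fake_conn, fake_query, fake_reconnect, and
query_with_retry are hypothetical names invented for this example,
standing in for the real connection handle, mysql_query(), and the
vclose()/vauth_open_read() pair used in the patch.

```c
#include <stdio.h>

/* Hypothetical stand-in for a database connection handle. */
struct fake_conn {
    int alive;          /* 0 once the server has dropped the socket */
    int reconnects;     /* number of times the connection was rebuilt */
    int can_reconnect;  /* 0 simulates a server that is really down */
};

/* Stand-in for mysql_query(): returns 0 on success, nonzero on error. */
static int fake_query(struct fake_conn *c, const char *sql)
{
    (void)sql;  /* the statement text is irrelevant to this sketch */
    return c->alive ? 0 : 1;
}

/* Stand-in for vclose() followed by vauth_open_read().
 * Returns 0 if a fresh connection could be established. */
static int fake_reconnect(struct fake_conn *c)
{
    if (!c->can_reconnect)
        return -1;
    c->alive = 1;
    c->reconnects++;
    return 0;
}

/* Run a query; on failure, rebuild the connection and retry exactly
 * once. Returns 0 on success, -1 if the retry also fails. */
static int query_with_retry(struct fake_conn *c, const char *sql)
{
    if (fake_query(c, sql) == 0)
        return 0;

    fprintf(stderr, "query failed, rebuilding connection\n");
    if (fake_reconnect(c) != 0)
        return -1;

    return fake_query(c, sql) == 0 ? 0 : -1;
}
```

Retrying only once is deliberate in this sketch: if the second attempt
also fails, something other than a stale socket is wrong, and the
error is handed back to the caller instead of looping.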

The included patch (inline and attached) implements this fix. Please
note that, at this time, there does not appear to be any way to disable
the timeout feature in MySQL.

Please feel free to comment on, tear apart, beat up, or otherwise rip
to shreds my fix!

--
Ron Gage
(LPIC1 MCP A+ Net+)
Westland, Michigan
--- vmysql.c~ 2006-05-29 10:17:20.000000000 -0400
+++ vmysql.c 2006-05-29 10:17:20.000000000 -0400
@@ -465,7 +465,31 @@
 );
     if (mysql_query(&mysql_read,SqlBufRead)) {
         fprintf(stderr, "vmysql: sql error[3]: %s\n", mysql_error(&mysql_read));
-        return(NULL);
+        /* Ron Gage - May 29, 2006 - With newer versions of MySQL, there is such a thing
+           as a connection timeout regardless of activity.  By default under MySQL 5, this
+           timeout is 28800 seconds (8 hours).  If your vpopmail system runs fine for the
+           first 8 hours, then stops authenticating, this timeout is your problem (especially
+           under authdaemond).
+
+           What this code does is when an error is encountered, it first tries to drop and
+           rebuild a connection to the SQL server and tries again.  If this second attempt
+           fails, then something other than the connection timeout is the problem.  This fix
+           needs to be implemented in other places but in my setup (Slackware 10.2, netqmail,
+           vpopmail, courier-authdaemond, courier-imapd and a few others), this is always where
+           the auth attempt died with a "MySQL server has gone away" error.
+        */
+
+        fprintf(stderr, "Attempting to rebuild connection to SQL server\n");
+        vclose();
+        verrori = 0;
+        if ( (err=vauth_open_read()) != 0 ) {
+            verrori = err;
+            return(NULL);
+        }
+        if (mysql_query(&mysql_read, SqlBufRead)) {
+            fprintf (stderr, "vmysql: connection rebuild failed: %s\n", mysql_error(&mysql_read));
+            return(NULL);
+        }
     }
 
     if (!(res_read = mysql_store_result(&mysql_read))) {
<vmysql.diff>