On 11-08-09 04:54 AM, Heikki Vatiainen wrote:
> On 08/05/2011 10:19 AM, Heikki Vatiainen wrote:
>
>>> Has anyone found any solutions/patches for the sql timeout failover
>>> issue with radiator? When radiator executes an sql statement on an
>>> sql server that times out not on connecting, but the statement
>>> itself, radiator disconnects and reconnects to the same sql server to
>>> try again. It never seems to failover to the next sql destination.
>>
>> Thanks for the problem description and the example code in your other
>> message. I will get back to this once I get the comments from the
>> development team.
>
> Can you provide a patch for this? That would make sure we have your
> version of the fix correctly understood.

I'm still testing/monitoring it too. So far, it will just alternate
between the first 2 sql sources. I have 4. I wanted to keep the 1st sql
source preferred. My patch may not be a desired solution, but here it is:

--- Radiator-4.8+patches/Radius/SqlDb.pm	2011-04-27 17:21:51.000000000 -0400
+++ Radiator-4.8+patches+custom/Radius/SqlDb.pm	2011-08-11 09:10:50.000000000 -0400
@@ -121,6 +121,16 @@ sub initialize
     $self->{SQLRetries} = 2;
     $self->{FailureBackoffTime} = 600; # Seconds
     $self->{DateFormat} = '%b %e, %Y %H:%M'; # eg 'Sep 3, 1995 13:37'
+    $self->{DBCur} = '-'; # keep track of the current (or if disconnected, previous) source.
+
+    $self->set("ConnectionHook",
+               'sub {
+                  my $self = shift;
+                  my $dbsource = ( split(/;/,$self->{dbname}) )[0];
+
+                  # If an sql connection occurs, log it so we can see it. Could use this to find out what sql server was in use, when a failure occurs.
+                  $self->log($main::LOG_WARNING, "SQL connected to DBSource: ($dbsource) [$self->{Identifier}]");
+               }');
 
     $self->set("ConnectionAttemptFailedHook",
                'sub {
@@ -170,6 +180,12 @@ sub reconnect
         $dbsource = &Radius::Util::format_special($dbsource, undef, $self);
         $dbusername = &Radius::Util::format_special($dbusername, undef, $self);
         $dbauth = &Radius::Util::format_special($dbauth, undef, $self);
+
+        # since reconnect always starts from the 1st DBSource, never reconnect to the 1st DBSource if the current/previous sql server (DBCur) matches.
+        # this should prevent always retrying the same server if an SQL timeout occurs, but the connection to the failing server succeeds.
+        next if $self->{DBCur} eq $dbsource;
+        $self->{DBCur} = $dbsource;
+
         $self->{dbname} = "$dbsource;$dbusername;$dbauth";
 
         return 1 if $Radius::SqlDb::handles{$self->{dbname}};

> There is also the question of possible problems with backwards
> compatibility. Currently Radiator does not advance to the next server if
> there's a timeout with the query. This change would extend the timeout
> behaviour from connections to queries too.
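
The source selection done by the reconnect hunk of the patch can be
sketched as a small standalone simulation. This is a minimal sketch and
not Radiator code: pick_source, the %state hash and the sql1..sql4 names
are hypothetical, and it assumes every connection attempt succeeds, so
only the DBCur check decides which source is chosen.

```perl
#!/usr/bin/perl
# Standalone sketch of the DBCur check from the patch; not Radiator code.
use strict;
use warnings;

# Hypothetical source names standing in for four DBSource entries.
my @sources = qw(sql1 sql2 sql3 sql4);

# Choose a source the way the patched reconnect loop would, assuming
# (for illustration only) that every connection attempt succeeds.
sub pick_source {
    my ($state, @candidates) = @_;
    foreach my $dbsource (@candidates) {
        # Skip the source that was current when the query timed out.
        next if $state->{DBCur} eq $dbsource;
        $state->{DBCur} = $dbsource;
        return $dbsource;
    }
    return;    # nothing left to try (e.g. only one source configured)
}

my %state = (DBCur => '-');
my @picked;

# Simulate five reconnects, each one triggered by a query timeout.
push @picked, pick_source(\%state, @sources) for 1 .. 5;

print join(' ', @picked), "\n";
```

Run as is, this prints "sql1 sql2 sql1 sql2 sql1". Because the loop
always restarts at the first source and only skips the single source
stored in DBCur, the third and fourth sources are never reached while
the first two accept connections, which matches the alternation
described above.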
>
> Does anyone see problems with this? Should it be made optional? Comments
> would be appreciated.
>
>> Can you tell why the problem occurred? Was the DB server having IO
>> problems? I'm just curious to know how this happens and how frequent the
>> problem might be.

Yes, I think it was I/O problems. Not always, but a lot of times at
6:25am (Debian Lenny), when the daily cron runs. The timeout issue is
not a Radiator problem. It's an OS/system/SQL problem. The only thing I
was asking about is whether Radiator should respond differently to an
SQL timeout error. It happens 2-3 times a day, but sometimes 0.

>
> Michael, do you have any comments on this?
>
>>> So, having multiple sql sources seems to be irrelevant with the issue
>>> of statement time outs.
>>
>> That is currently true.
>

Yes, multiple sql sources are irrelevant for sql timeout issues. It will
just constantly re-connect to the first sql source.

_______________________________________________
radiator mailing list
radiator@open.com.au
http://www.open.com.au/mailman/listinfo/radiator