Hello,

I made a patch for server reconnect. At the moment I have no access to a computer with the redis library installed, so I could not build it; hopefully it compiles. If you can try it and report the result, that would be great, and I can commit it then.

Cheers,
Daniel

On 1/16/12 12:15 PM, Javier Gallart wrote:
Hi Daniel

On Mon, Jan 16, 2012 at 9:47 AM, Daniel-Constantin Mierla <mico...@gmail.com> wrote:

    Hello,


    On 1/13/12 12:27 PM, Javier Gallart wrote:
    Hi Daniel

    both values are null.
    OK, that could be a hint that the connection is down and that a
    reconnect should be tried...


    I might have found something: apparently some of the
    kamailio->redis sockets were inactive for a while and were being
    closed on the redis end.

    Do you know if there is a keepalive mechanism that redis offers,
    or a command to set the timeout value from the client side?


In the redis config file, the only related value I've seen is "timeout". If it is set to 0, the server never disconnects inactive clients. From the client perspective, what about this: http://www.redis.io/commands/ping
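
One way to act on the PING idea from the client side is to ping any connection that has been idle for a good part of the server "timeout". The following is only a sketch under stated assumptions: `keepalive_due()` and the half-timeout margin are made up for illustration and are not part of ndb_redis or of the redis client library.

```c
#include <time.h>

/*
 * Hypothetical helper: decide whether an idle connection should be sent
 * a PING before the server-side "timeout" closes it.
 *
 * last_used      - time of the last command sent on the connection
 * now            - current time
 * server_timeout - redis "timeout" value in seconds (0 = never close)
 *
 * Returns 1 when a keepalive PING is due, 0 otherwise.
 */
int keepalive_due(time_t last_used, time_t now, unsigned int server_timeout)
{
    if (server_timeout == 0)
        return 0; /* server never drops idle clients */
    if (now < last_used)
        return 0; /* clock went backwards, skip this round */
    /* ping once half of the idle budget is used up (assumed margin) */
    return (unsigned long)(now - last_used) >= server_timeout / 2u;
}
```

With "timeout 600", this would ask for a PING once a connection has been idle for 300 seconds or more.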

Regards

Javi

    Cheers,
    Daniel

    This is the redis default config:
    # Close the connection after a client is idle for N seconds (0 to disable)
    timeout 600

    I've set the timeout value to 0 to confirm if this is actually
    the problem.

    In case it might be useful for somebody, we've used lsof in
    repeat mode to monitor the status of the sockets:

    server# lsof -i :6379 -r 5"m===%T==="  | grep -e == -e kamailio
    ===05:28:26===
    kamailio  13365 kamailio    4u  IPv4  58622      0t0  TCP localhost:34994->localhost:6379 (ESTABLISHED)
    kamailio  13366 kamailio    4u  IPv4  58626      0t0  TCP localhost:34995->localhost:6379 (ESTABLISHED)
    kamailio  13367 kamailio    4u  IPv4  58628      0t0  TCP localhost:34996->localhost:6379 (ESTABLISHED)
    kamailio  13368 kamailio    4u  IPv4  58632      0t0  TCP localhost:34997->localhost:6379 (ESTABLISHED)
    kamailio  13369 kamailio    4u  IPv4  58649      0t0  TCP localhost:35000->localhost:6379 (ESTABLISHED)
    kamailio  13370 kamailio    4u  IPv4  58661      0t0  TCP localhost:35003->localhost:6379 (ESTABLISHED)
    kamailio  13376 kamailio   10u  IPv4  58710      0t0  TCP localhost:35013->localhost:6379 (ESTABLISHED)
    kamailio  13377 kamailio    4u  IPv4  58705      0t0  TCP localhost:35012->localhost:6379 (ESTABLISHED)
    kamailio  13378 kamailio    4u  IPv4  58695      0t0  TCP localhost:35008->localhost:6379 (ESTABLISHED)
    kamailio  13381 kamailio    4u  IPv4  58691      0t0  TCP localhost:35006->localhost:6379 (ESTABLISHED)
    kamailio  13382 kamailio    4u  IPv4  58693      0t0  TCP localhost:35007->localhost:6379 (ESTABLISHED)
    ===05:28:31===
    kamailio  13365 kamailio    4u  IPv4  58622      0t0  TCP localhost:34994->localhost:6379 (ESTABLISHED)
    kamailio  13366 kamailio    4u  IPv4  58626      0t0  TCP localhost:34995->localhost:6379 (CLOSE_WAIT)
    kamailio  13367 kamailio    4u  IPv4  58628      0t0  TCP localhost:34996->localhost:6379 (ESTABLISHED)
    kamailio  13368 kamailio    4u  IPv4  58632      0t0  TCP localhost:34997->localhost:6379 (CLOSE_WAIT)
    kamailio  13369 kamailio    4u  IPv4  58649      0t0  TCP localhost:35000->localhost:6379 (CLOSE_WAIT)
    kamailio  13370 kamailio    4u  IPv4  58661      0t0  TCP localhost:35003->localhost:6379 (CLOSE_WAIT)
    kamailio  13376 kamailio   10u  IPv4  58710      0t0  TCP localhost:35013->localhost:6379 (CLOSE_WAIT)
    kamailio  13377 kamailio    4u  IPv4  58705      0t0  TCP localhost:35012->localhost:6379 (CLOSE_WAIT)
    kamailio  13378 kamailio    4u  IPv4  58695      0t0  TCP localhost:35008->localhost:6379 (CLOSE_WAIT)
    kamailio  13381 kamailio    4u  IPv4  58691      0t0  TCP localhost:35006->localhost:6379 (CLOSE_WAIT)
    kamailio  13382 kamailio    4u  IPv4  58693      0t0  TCP localhost:35007->localhost:6379 (CLOSE_WAIT)
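
CLOSE_WAIT in the listing above means the redis end has already closed the connection while the kamailio process still holds its socket open. A process can notice this state itself by peeking at the socket. The sketch below is an illustration only: `peer_has_closed()` is a hypothetical helper, not something from ndb_redis, and it relies on `MSG_DONTWAIT`, which is available on Linux.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: report whether the peer has already closed its
 * end of a connected socket, without consuming any pending data.
 *
 * Returns 1 if the peer closed (orderly shutdown), 0 if the connection
 * still looks open, -1 on any other error.
 */
int peer_has_closed(int fd)
{
    char c;
    ssize_t n = recv(fd, &c, 1, MSG_PEEK | MSG_DONTWAIT);

    if (n == 0)
        return 1; /* end of stream: the peer has closed */
    if (n > 0)
        return 0; /* data is waiting, connection is alive */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 0; /* nothing to read, but not closed */
    return -1;
}
```

A check like this could run before sending a command, so that a closed socket triggers a reconnect instead of a failed write.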

    Regards

    Javi

    On Fri, Jan 13, 2012 at 9:35 AM, Daniel-Constantin Mierla
    <mico...@gmail.com> wrote:

        Hello,


        On 1/13/12 8:00 AM, Javier Gallart wrote:

            Hi all

            I have started running some tests with the ndb_redis
            module. So far we have not stressed the module (no more
            than 5 HGET commands/second at most). It works well,
            but at some point it starts failing. The failures
            are easy to spot because the logs always show this:
            INFO: <core> [main.c:811]: INFO: signal 13 received

        This is due to a broken connection. What do you get in the redis
        reply and info variables?
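
For context, signal 13 is SIGPIPE on Linux, which is raised when a process writes to a connection whose other end has gone away. The sketch below only illustrates that mechanism with a plain pipe instead of a redis socket; `errno_of_write_to_closed_peer()` is a hypothetical name, and nothing here reflects how Kamailio itself handles the signal.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical demo: write to a pipe whose read end is closed, with
 * SIGPIPE ignored, and return the errno reported by write().
 * Returns 0 if the write unexpectedly succeeded, -1 if setup failed.
 */
int errno_of_write_to_closed_peer(void)
{
    int fds[2];
    int saved = 0;

    signal(SIGPIPE, SIG_IGN); /* get EPIPE instead of signal 13 */
    if (pipe(fds) != 0)
        return -1;
    close(fds[0]); /* simulate the peer going away */
    if (write(fds[1], "x", 1) < 0)
        saved = errno;
    close(fds[1]);
    return saved;
}
```

With the signal ignored, the failed write reports EPIPE, which is an error the caller can check and react to.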


            After that the redis value is always null. If I restart
            kamailio it starts working again.
            I've run kamailio with debug=4 but I haven't seen any more
            useful information. On the redis side, I could find
            nothing in the logs either, and the number of clients
            connected is always much lower than the configured maximum.
            Any idea?
            On the other hand, if I restart redis we need to restart
            kamailio to restore the connections. Is reconnection
            to redis on the roadmap?


        It should not be that complex: the code for initializing the
        connection is already there and can be reused to connect
        again in case of failure.

        Cheers,
        Daniel

-- Daniel-Constantin Mierla -- http://www.asipto.com
        http://linkedin.com/in/miconda -- http://twitter.com/miconda




    _______________________________________________
    SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list
    sr-users@lists.sip-router.org
    http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users

--
Daniel-Constantin Mierla -- http://www.asipto.com
http://linkedin.com/in/miconda -- http://twitter.com/miconda

diff --git a/modules/ndb_redis/redis_client.c b/modules/ndb_redis/redis_client.c
index 9f4ffc4..f477f92 100644
--- a/modules/ndb_redis/redis_client.c
+++ b/modules/ndb_redis/redis_client.c
@@ -199,6 +199,61 @@ redisc_server_t *redisc_get_server(str *name)
 /**
  *
  */
+int redisc_reconnect_server(redisc_server_t *rsrv)
+{
+       char *addr;
+       unsigned int port, db;
+       param_t *pit = NULL;
+       struct timeval tv;
+
+       tv.tv_sec = 1;
+       tv.tv_usec = 0;
+       addr = "127.0.0.1";
+       port = 6379;
+       db = 0;
+       for (pit = rsrv->attrs; pit; pit=pit->next)
+       {
+               if(pit->name.len==4 && strncmp(pit->name.s, "addr", 4)==0) {
+                       addr = pit->body.s;
+                       addr[pit->body.len] = '\0';
+               } else if(pit->name.len==4 && strncmp(pit->name.s, "port", 4)==0) {
+                       if(str2int(&pit->body, &port) < 0)
+                               port = 6379;
+               } else if(pit->name.len==2 && strncmp(pit->name.s, "db", 2)==0) {
+                       if(str2int(&pit->body, &db) < 0)
+                               db = 0;
+               }
+       }
+       if(rsrv->ctxRedis!=NULL) {
+               redisFree(rsrv->ctxRedis);
+               rsrv->ctxRedis = NULL;
+       }
+
+       rsrv->ctxRedis = redisConnectWithTimeout(addr, port, tv);
+       if(!rsrv->ctxRedis)
+               goto err;
+       if (rsrv->ctxRedis->err)
+               goto err2;
+       if (redisCommandNR(rsrv->ctxRedis, "PING"))
+               goto err2;
+       if (redisCommandNR(rsrv->ctxRedis, "SELECT %i", db))
+               goto err2;
+
+       return 0;
+
+err2:
+       LM_ERR("error communicating with redis server [%.*s] (%s:%d/%d): %s\n",
+               rsrv->sname->len, rsrv->sname->s, addr, port, db, rsrv->ctxRedis->errstr);
+       return -1;
+err:
+       LM_ERR("failed to connect to redis server [%.*s] (%s:%d/%d)\n",
+               rsrv->sname->len, rsrv->sname->s, addr, port, db);
+       return -1;
+}
+
+/**
+ *
+ */
 int redisc_exec(str *srv, str *cmd, str *argv1, str *argv2, str *argv3,
                str *res)
 {
@@ -237,6 +293,14 @@ int redisc_exec(str *srv, str *cmd, str *argv1, str *argv2, str *argv3,
        c = cmd->s[cmd->len];
        cmd->s[cmd->len] = '\0';
        rpl->rplRedis = redisCommand(rsrv->ctxRedis, cmd->s);
+       if(rpl->rplRedis == NULL)
+       {
+               /* null reply, reconnect and try again */
+               if(redisc_reconnect_server(rsrv)==0)
+               {
+                       rpl->rplRedis = redisCommand(rsrv->ctxRedis, cmd->s);
+               }
+       }
        cmd->s[cmd->len] = c;
        return 0;
 }
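
The second hunk of the patch follows a "reconnect once, retry once" pattern: a NULL reply is treated as a broken connection, the connection is re-created, and the same command is sent again. The sketch below shows that pattern in isolation so it can be tried without the redis library. Every name in it (`demo_conn_t`, `demo_command()`, `demo_reconnect()`, `demo_exec()`) is a hypothetical stand-in, not the module's or the library's API.

```c
#include <stddef.h>

/* Hypothetical stand-in for a connection; not the real redis context. */
typedef struct demo_conn {
    int alive;      /* 1 while the link is usable */
    int reconnects; /* how many times demo_reconnect() ran */
} demo_conn_t;

/* Pretend command: returns a reply string, or NULL on a dead link. */
static const char *demo_command(demo_conn_t *c, const char *cmd)
{
    (void)cmd;
    return c->alive ? "PONG" : NULL;
}

/* Pretend reconnect: always succeeds here and revives the link. */
static int demo_reconnect(demo_conn_t *c)
{
    c->alive = 1;
    c->reconnects++;
    return 0;
}

/*
 * Run a command; on a NULL reply, reconnect once and retry once.
 * Returns the reply, or NULL if the retry also failed.
 */
const char *demo_exec(demo_conn_t *c, const char *cmd)
{
    const char *rpl = demo_command(c, cmd);

    if (rpl == NULL && demo_reconnect(c) == 0)
        rpl = demo_command(c, cmd);
    return rpl;
}
```

Limiting the retry to a single attempt keeps the time spent on a dead server bounded; the caller still has to handle a NULL result when the retry fails as well.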